Commit Graph

139 Commits

Author SHA1 Message Date
a10569b5ab New model format
Use Model objects and binary serialization format
2022-11-23 17:01:04 -08:00
f4ae5f851d Hash words and ngrams 2022-11-23 12:53:01 -08:00
1d1ccbb7cc Add min_count 2022-11-23 11:42:58 -08:00
e17c79c231 Remove obsolete licensing note in README 2022-11-23 11:34:55 -08:00
af1d1749d2 Refactor word count dict in compiler
This makes future changes to the algorithm much simpler.
2022-11-23 11:33:40 -08:00
aea35ad059 Switch to GPL 2022-11-23 11:28:27 -08:00
30a2ebe33e Bump version to 2.1.3 2022-11-22 11:47:40 -08:00
4cb8b71407 Merge branch 'master' of https://git.kj7rrv.com/kj7rrv/gptc 2022-11-22 11:46:13 -08:00
7d1cbcaee0 Make sure text is lowercase 2022-11-22 11:44:13 -08:00
82524345f3 Update 'README.md' 2022-09-23 19:15:16 -07:00
c2cd6f62fb Revert "Switch to statistics.stdev"
This reverts commit 76df1dc56d.

Fix major performance regression
2022-07-22 14:45:43 -07:00
76df1dc56d Switch to statistics.stdev 2022-07-22 14:22:01 -07:00
ad138b37d6 Bump version to 2.1.2 2022-07-21 11:49:59 -07:00
3634a10aeb Fix another emoji bug 2022-07-21 11:49:35 -07:00
7250787228 Bump version to 2.1.1 2022-07-20 14:06:56 -07:00
9538cf8c22 Fix emoji handling 2022-07-20 14:06:27 -07:00
185692790f Add emoji checks, improve docs 2022-07-19 19:15:59 -07:00
73b800d60d remove python -m 2022-07-19 17:02:57 -07:00
b61ad35ae7 Bump version to v2.1.0 2022-07-19 17:01:08 -07:00
dc6eb48625 Optional dependency on emoji 2022-07-19 16:43:39 -07:00
ff8cba84c7 format pack.py 2022-07-19 16:02:05 -07:00
8c6dd0bde9 Type checks for pack 2022-07-19 10:43:10 -07:00
5082c2226b Move pack to main module; format code 2022-07-18 16:03:58 -07:00
e711767d24 Add type checks to all functions that need them 2022-07-17 18:42:38 -07:00
67ac3a4591 Working type checks 2022-07-17 18:22:19 -07:00
b36d8e6081 Fix annotations 2022-07-17 17:08:11 -07:00
48639f5d8d Non-working type checks 2022-07-17 16:51:19 -07:00
a207e281e7 Format code with black 2022-07-17 16:28:04 -07:00
e272ab42d1 Document emojis 2022-07-17 16:27:16 -07:00
bd0028a108 Add emoji support to tokenizer 2022-07-17 16:14:02 -07:00
62c3c27ddd Remove license field from pyproject.toml 2022-07-14 20:06:48 -07:00
70e1f542df Fix pyproject.toml syntax 2022-07-14 20:02:07 -07:00
f6b3e942d5 Fix license in pyproject.toml 2022-07-14 19:58:22 -07:00
80428e1bbf Switch to pyproject.toml 2022-07-14 19:18:16 -07:00
1e04769e9d Bump version to 2.0.0 2022-07-14 15:46:59 -07:00
ce80647bbb Add ngrams
First git commit from new laptop!
2022-07-13 11:45:17 -07:00
c54c639b2f Add benchmark script 2022-07-05 16:29:32 -07:00
b133facd70 Recompile model 2022-05-21 14:03:08 -07:00
4188045b75 New CLI tool 2022-05-21 14:02:20 -07:00
e06f2def24 Make pack a function 2022-05-21 13:09:53 -07:00
bebd286163 Fix heading 2022-05-21 12:54:31 -07:00
2d9f7cfc5a Add blank lines before and after headings in README 2022-05-20 17:22:37 -07:00
5378be9418 Format code 2022-05-20 17:16:00 -07:00
4ddeefad07 Remove automatic recompilation 2022-05-18 15:14:08 -07:00
75bae768b6 Update 'README.md' 2022-04-02 11:11:04 -07:00
34af3a8a0a Weighting
Weights words based on the standard deviation of the per-word
confidences; closes #5
2022-03-07 12:13:09 -08:00
4d93b245e8 Switch to LGPL v3 or later 2022-03-05 10:17:17 -08:00
9d1483e445 Update copyright to 2022 2022-03-05 09:43:38 -08:00
4b1e82514f Reformat code with black 2022-03-05 09:42:52 -08:00
252cbaeb9d Merge branch 'master' of https://github.com/kj7rrv/gptc 2022-02-02 19:45:01 -08:00