Treat numbers as words #12

Closed
opened 2022-11-26 17:12:10 -08:00 by kj7rrv · 0 comments
Owner

Numbers should be treated as words by the tokenizer. It would be great to include commas and periods within numbers as well, but this might not be feasible.

Input: `Testing 123,456.789.Test`
Ideal output: `['testing', '123,456.789', 'test']`
Acceptable output: `['testing', '123', '456', '789', 'test']`
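
As a rough illustration of the two behaviours above, here is a minimal regex-based sketch. It is not gptc's actual tokenizer: the `tokenize` name, the `keep_separators` flag, and the restriction to ASCII letters are all assumptions made for the example.

```python
import re

# Hypothetical patterns, not taken from gptc.
# "Ideal": digits may contain commas/periods, but only when another digit follows.
_WORD_OR_NUMBER = re.compile(r"\d+(?:[.,]\d+)*|[a-z]+")
# "Acceptable": every run of digits becomes its own token.
_WORD_OR_DIGITS = re.compile(r"\d+|[a-z]+")


def tokenize(text, keep_separators=True):
    """Lowercase the text and split it into word and number tokens."""
    pattern = _WORD_OR_NUMBER if keep_separators else _WORD_OR_DIGITS
    return pattern.findall(text.lower())


print(tokenize("Testing 123,456.789.Test"))
print(tokenize("Testing 123,456.789.Test", keep_separators=False))
```

Because a separator is only kept when a digit follows it, the period before `Test` is dropped. One caveat with this approach is that a comma-separated list such as `1,2,3` would also be merged into a single token, since the pattern cannot tell a list from a grouped number.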

kj7rrv added the enhancement, model-break, and wait-for-break labels 2022-11-26 18:21:46 -08:00
kj7rrv changed title from Numbers should be treated as words to Treat numbers as words 2022-12-24 12:25:21 -08:00
Reference: kj7rrv/gptc#12