These datasets were generated with the scripts in the reber-grammar repo.
They correspond to the problems described in the article "Learning Sequential Structure with the Real-time Recurrent Learning Algorithm" by Smith and Zipser.

File | Grammar | Number of strings | Mean string length | Standard deviation of length | Loop probability |
---|---|---|---|---|---|
reber_train_2.4M.txt | simple | 2.4 million | 8.0031 | 3.3683 | 0.5 |
reber_test_1M.txt | simple | 1 million | 7.9984 | 3.3671 | 0.5 |
symmetrical_reber_train_2.4M.txt | symmetrical | 2.4 million | 10.0031 | 3.3671 | 0.5 |
symmetrical_reber_test_1M.txt | symmetrical | 1 million | 10.0017 | 3.3677 | 0.5 |
symmetrical_reber_loop0.3_train_2.4M.txt | symmetrical | 2.4 million | 9.1412 | 2.6391 | 0.3 |
symmetrical_reber_loop0.3_test_1M.txt | symmetrical | 1 million | 9.1442 | 2.6423 | 0.3 |
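
The length statistics in the table can be re-checked with a short script. The sketch below is not part of the reber-grammar repo. It assumes each file stores one string per line and that every character on a line counts toward the length. The helper name `length_stats` is hypothetical, and the population standard deviation is an assumption (with millions of strings it is practically the same as the sample one).

```python
import math


def length_stats(lines):
    """Return (count, mean, std) of the lengths of the non-empty lines.

    Assumption: one string per line. The standard deviation is the
    population one (divide by n).
    """
    n = 0
    total = 0
    total_sq = 0
    for line in lines:
        s = line.strip()
        if not s:
            continue  # skip blank lines, e.g. a trailing newline
        n += 1
        total += len(s)
        total_sq += len(s) ** 2
    if n == 0:
        raise ValueError("no strings found")
    mean = total / n
    variance = total_sq / n - mean ** 2
    # Guard against a tiny negative value caused by rounding.
    return n, mean, math.sqrt(max(variance, 0.0))
```

To check one file, pass an open file handle, for example `with open("reber_test_1M.txt") as f: print(length_stats(f))`. The function streams the file line by line, so it does not need to hold millions of strings in memory.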