This is a follow-up to my attempt to look at the relative difficulty of the New Testament books. I began to wonder what the best reading order for the NT books would be, if you wanted to read a whole book at a time, and learn as few new words as possible each time. This turned out not to be a very insightful question to ask, as the rest of this page demonstrates. But I'm glad that I followed up on my idea at least. :-) Here it is in excruciating detail...
Number of occurrences of each word (lemma), in descending order of frequency:
SELECT COUNT(lemma) AS cnt FROM sblgnt GROUP BY lemma ORDER BY cnt DESC;
Not a lot of surprises here:
Here it is with a logarithmic scale:
So you might want to ask, how many lemmas occur 10 times or more?
SELECT COUNT(lemma) FROM (SELECT lemma,COUNT(lemma) AS cnt FROM sblgnt GROUP BY lemma) AS tmp WHERE cnt >= 10;
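To try other cutoffs without editing the query each time, the threshold can be passed as a parameter. This is only a sketch using Python's built-in sqlite3 module: the rows are made-up toy data standing in for the real sblgnt table, and the function name `lemmas_at_least` is my own invention.

```python
import sqlite3

# Toy stand-in for the sblgnt table (hypothetical rows, not real data).
rows = [
    ("ho", "2 John"), ("ho", "3 John"), ("ho", "Jude"),
    ("kai", "2 John"), ("kai", "Jude"),
    ("agape", "2 John"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sblgnt (lemma TEXT, book_name TEXT)")
conn.executemany("INSERT INTO sblgnt VALUES (?, ?)", rows)


def lemmas_at_least(conn, threshold):
    """Return how many distinct lemmas occur at least `threshold` times."""
    sql = (
        "SELECT COUNT(*) FROM "
        "(SELECT lemma FROM sblgnt GROUP BY lemma HAVING COUNT(*) >= ?)"
    )
    return conn.execute(sql, (threshold,)).fetchone()[0]


# In the toy data "ho" occurs 3x, "kai" 2x and "agape" 1x.
print(lemmas_at_least(conn, 2))  # 2
```

With the real table, `lemmas_at_least(conn, 10)` should give the same number as the query above.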
I started wondering what the best order to read the books would be, if your priority was to read complete books, and learn as little vocabulary as possible at one go. (That is, once you've read the first book and know all of those words, you don't need to relearn them to read the second book.) I couldn't think of a way to figure this out with a simple SQL query. I had to create an algorithm.
-- this is a temporary copy of sblgnt
CREATE TEMPORARY TABLE words ( _id INTEGER PRIMARY KEY AUTOINCREMENT, lemma TEXT, book_name TEXT );
INSERT INTO words (lemma,book_name) SELECT lemma,book_name FROM sblgnt;
-- another table to hold used words
CREATE TEMPORARY TABLE used_words ( _id INTEGER PRIMARY KEY AUTOINCREMENT, lemma TEXT, book_name TEXT );
-- do this 27 times (once per book), binding ? to the book_name returned by the SELECT {
-- take the book_name with the fewest new words:
SELECT COUNT(DISTINCT lemma) AS cnt,book_name FROM words WHERE lemma NOT IN (SELECT lemma FROM used_words) GROUP BY book_name ORDER BY cnt ASC LIMIT 1;
-- put those words into used_words
INSERT INTO used_words (lemma,book_name) SELECT lemma,book_name FROM sblgnt WHERE book_name=?;
-- remove them from words
DELETE FROM words WHERE book_name=?;
-- }
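The loop above needs a host language to drive it. Here is a minimal sketch of one way to do that with Python's sqlite3 module, not necessarily how I actually ran it. The function name `greedy_order`, the toy rows, and the tie-break on book_name are all assumptions added for illustration; it also stops when no book has new words left rather than counting to 27.

```python
import sqlite3


def greedy_order(rows):
    """Greedy reading order for (lemma, book_name) rows.

    Returns [(book_name, new_lemma_count), ...] in the order chosen.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sblgnt (lemma TEXT, book_name TEXT)")
    conn.executemany("INSERT INTO sblgnt VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TEMPORARY TABLE words AS SELECT lemma, book_name FROM sblgnt")
    conn.execute("CREATE TEMPORARY TABLE used_words (lemma TEXT, book_name TEXT)")

    order = []
    while True:
        # Book with the fewest not-yet-learned lemmas.
        # Ties are broken by name, which is an assumption of this sketch.
        row = conn.execute(
            "SELECT COUNT(DISTINCT lemma) AS cnt, book_name FROM words "
            "WHERE lemma NOT IN (SELECT lemma FROM used_words) "
            "GROUP BY book_name ORDER BY cnt ASC, book_name ASC LIMIT 1"
        ).fetchone()
        if row is None:  # no book has any new words left
            break
        cnt, book = row
        order.append((book, cnt))
        # Mark the book's lemmas as learned, then drop the book.
        conn.execute(
            "INSERT INTO used_words (lemma, book_name) "
            "SELECT lemma, book_name FROM sblgnt WHERE book_name = ?", (book,))
        conn.execute("DELETE FROM words WHERE book_name = ?", (book,))
    conn.close()
    return order


# Hypothetical toy data: "3 John" shares x and y with "2 John".
toy = [
    ("x", "2 John"), ("x", "2 John"), ("y", "2 John"),
    ("x", "3 John"), ("y", "3 John"), ("z", "3 John"),
    ("p", "Jude"), ("q", "Jude"), ("r", "Jude"), ("s", "Jude"),
]
print(greedy_order(toy))  # [('2 John', 2), ('3 John', 1), ('Jude', 4)]
```

In the toy data the second book has three distinct lemmas but costs only one new word, because two were already learned from the first.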
This gives the following order. The number shows the number of new words you have to learn to read that book.
At first it's surprising that you have to learn fewer words for 3 John than for 2 John, but that's because you already learned a bunch of words for 2 John (95, in fact). 3 John has 108 distinct lemmas, but 52 of them also occur in 2 John, so only 56 are left to learn.
This wasn't as interesting as I'd hoped it would be, because the measure is strongly biased by the length of the book. There's a comparison below between this order, and simply looking at how many distinct lemmas there are in each book. You don't get a dramatically different answer than you would if you just started with the books with the fewest distinct lemmas. (And really, who's going to wait until the end to read the gospels?)
Unique lemmas | Sorted by unique lemmas | The “ideal” order | New words to learn | |
---|---|---|---|---|
95 | 2 John | 2 John | 95 | |
108 | 3 John | 3 John | 56 | |
140 | Philemon | Philemon | 80 | |
226 | Jude | 1 John | 134 | |
233 | 1 John | 2 Thessalonians | 126 | |
249 | 2 Thessalonians | Jude | 127 | |
300 | Titus | Titus | 150 | |
361 | 1 Thessalonians | 1 Thessalonians | 147 | |
399 | 2 Peter | 2 Peter | 171 | |
430 | Colossians | Colossians | 163 | |
442 | Philippians | Philippians | 161 | |
452 | 2 Timothy | Ephesians | 155 | |
519 | Galatians | 2 Timothy | 157 | |
528 | Ephesians | Galatians | 157 | |
538 | 1 Timothy | 1 Timothy | 170 | |
543 | 1 Peter | 1 Peter | 163 | |
555 | James | James | 173 | |
786 | 2 Corinthians | 2 Corinthians | 211 | |
911 | Revelation | 1 Corinthians | 242 | |
952 | 1 Corinthians | Romans | 227 | |
999 | John | Hebrews | 288 | |
1029 | Hebrews | Revelation | 272 | |
1054 | Romans | John | 241 | |
1341 | Mark | Mark | 327 | |
1680 | Matthew | Matthew | 287 | |
2032 | Acts | Luke | 407 | |
2046 | Luke | Acts | 574 |
A somewhat more pedagogically relevant question would be, “Suppose someone already knew all the words that occur 25 times or more. Then what would the ideal reading order be?” To answer this, I ran this command before I ran the ‘script’ above.
DELETE FROM words WHERE lemma IN (SELECT lemma FROM (SELECT lemma,COUNT(lemma) AS cnt FROM sblgnt GROUP BY lemma) AS tmp WHERE cnt >= 25);
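To see what that DELETE leaves behind for a given cutoff, something like the following sketch could be used. It relies on Python's sqlite3 module; the function name `remaining_after_known`, its parameter, and the toy rows are hypothetical and only there for illustration.

```python
import sqlite3


def remaining_after_known(rows, known_threshold):
    """Drop lemmas occurring >= known_threshold times in total.

    Returns {book_name: number of distinct lemmas still to learn}.
    Books with nothing left do not appear in the result.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sblgnt (lemma TEXT, book_name TEXT)")
    conn.executemany("INSERT INTO sblgnt VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TEMPORARY TABLE words AS SELECT lemma, book_name FROM sblgnt")
    # Same idea as the DELETE above, with the cutoff as a parameter.
    conn.execute(
        "DELETE FROM words WHERE lemma IN "
        "(SELECT lemma FROM sblgnt GROUP BY lemma HAVING COUNT(*) >= ?)",
        (known_threshold,))
    result = dict(conn.execute(
        "SELECT book_name, COUNT(DISTINCT lemma) FROM words GROUP BY book_name"))
    conn.close()
    return result


# Hypothetical toy data: "kai" occurs 5x in total, "agape" 2x, "logos" 1x.
toy = ([("kai", "A")] * 3 + [("kai", "B")] * 2
       + [("logos", "A"), ("agape", "B"), ("agape", "B")])
print(remaining_after_known(toy, 5))  # {'A': 1, 'B': 1}
```

The frequency is counted across the whole corpus, not per book, so "kai" is treated as known even though it occurs only twice in book B.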
Unfortunately that doesn't change much:
The “ideal” order | New words | The “ideal” order, setting aside ≥25x words | New words | |
---|---|---|---|---|
2 John | 95 | 2 John | 11 | |
3 John | 56 | 3 John | 23 | |
Philemon | 80 | Philemon | 45 | |
1 John | 134 | 1 John | 53 | |
2 Thessalonians | 126 | 2 Thessalonians | 80 | |
Jude | 127 | Jude | 87 | |
Titus | 150 | 1 Thessalonians | 116 | |
1 Thessalonians | 147 | Titus | 127 | |
2 Peter | 171 | Colossians | 150 | |
Colossians | 163 | 2 Peter | 146 | |
Philippians | 161 | Philippians | 142 | |
Ephesians | 155 | Ephesians | 140 | |
2 Timothy | 157 | 2 Timothy | 147 | |
Galatians | 157 | Galatians | 144 | |
1 Timothy | 170 | 1 Timothy | 162 | |
1 Peter | 163 | 1 Peter | 154 | |
James | 173 | James | 162 | |
2 Corinthians | 211 | 2 Corinthians | 203 | |
1 Corinthians | 242 | 1 Corinthians | 232 | |
Romans | 227 | Romans | 225 | |
Hebrews | 288 | Hebrews | 281 | |
Revelation | 272 | Revelation | 264 | |
John | 241 | John | 233 | |
Mark | 327 | Mark | 325 | |
Matthew | 287 | Matthew | 286 | |
Luke | 407 | Luke | 407 | |
Acts | 574 | Acts | 574 |
(Can you tell I'm writing this as I go along?)
Or you could ask, “Suppose someone doesn't care about learning a word unless it comes up at least five times. Then what would the ideal reading order be?” To answer this, I ran this command before I ran the ‘script’ above.
DELETE FROM words WHERE lemma NOT IN (SELECT lemma FROM (SELECT lemma,COUNT(lemma) AS cnt FROM sblgnt GROUP BY lemma) AS tmp WHERE cnt >= 5);
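But this actually breaks my algorithm, because using this method you get some “freebie” books (i.e., no new vocabulary), which my algorithm does not expect: a book with nothing left to learn has no matching rows in words, so the GROUP BY query never returns it. So, it must be revised: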
-- this is a temporary copy of sblgnt
CREATE TEMPORARY TABLE words ( _id INTEGER PRIMARY KEY AUTOINCREMENT, lemma TEXT, book_name TEXT );
INSERT INTO words (lemma,book_name) SELECT lemma,book_name FROM sblgnt;
-- keep only the lemmas that occur at least five times
DELETE FROM words WHERE lemma NOT IN (SELECT lemma FROM (SELECT lemma,COUNT(lemma) AS cnt FROM sblgnt GROUP BY lemma) AS tmp WHERE cnt >= 5);
-- another table to hold used words
CREATE TEMPORARY TABLE used_words ( _id INTEGER PRIMARY KEY AUTOINCREMENT, lemma TEXT, book_name TEXT );
-- a table to hold book names
CREATE TEMPORARY TABLE books ( book_name TEXT );
INSERT INTO books SELECT name FROM book_names;
-- do this 27 times (once per book), binding ? to the book_name returned by the SELECT {
-- take the book_name with the fewest new words (0 if it has none left):
SELECT
IFNULL(tmp.cnt,0) AS cnt,books.book_name FROM
books
LEFT JOIN
( SELECT COUNT(DISTINCT lemma) AS cnt,book_name FROM words WHERE lemma NOT IN (SELECT lemma FROM used_words) GROUP BY book_name ) AS tmp
ON books.book_name = tmp.book_name
ORDER BY cnt ASC LIMIT 1;
-- put those words into used_words
INSERT INTO used_words (lemma,book_name) SELECT lemma,book_name FROM sblgnt WHERE book_name=?;
-- remove them from words
DELETE FROM words WHERE book_name=?;
-- remove it from the temporary book name table
DELETE FROM books WHERE book_name=?;
-- }
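As a cross-check on the SQL, the same greedy idea can be written with plain Python sets, where a “freebie” book simply comes out with a count of 0. This is a sketch rather than the script I ran. The name `greedy_order_sets`, the `min_count` parameter, the toy rows, and the tie-break by book name are all assumptions.

```python
from collections import Counter


def greedy_order_sets(rows, min_count=1):
    """Greedy reading order over (lemma, book_name) rows, using sets.

    Lemmas occurring fewer than min_count times overall are ignored.
    Returns [(book_name, new_lemma_count), ...]; a count of 0 is a freebie.
    """
    rows = list(rows)
    freq = Counter(lemma for lemma, _ in rows)

    # Every book gets an entry, even if all of its lemmas are filtered out.
    vocab = {}
    for lemma, book in rows:
        vocab.setdefault(book, set())
        if freq[lemma] >= min_count:
            vocab[book].add(lemma)

    known, order = set(), []
    while vocab:
        # Fewest new lemmas first; ties broken by name (an assumption).
        book = min(vocab, key=lambda b: (len(vocab[b] - known), b))
        order.append((book, len(vocab[book] - known)))
        known |= vocab.pop(book)
    return order


# Hypothetical toy data: x occurs 6x overall, p 5x, y and q once each.
toy = ([("x", "2 John")] * 5 + [("y", "2 John"), ("x", "3 John")]
       + [("p", "Jude")] * 5 + [("q", "Jude")])
print(greedy_order_sets(toy, min_count=5))
# [('2 John', 1), ('3 John', 0), ('Jude', 1)]
```

Keeping an entry for every book up front plays the same role as the books table and the LEFT JOIN above: a book with nothing new to learn is still a candidate.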
Prepare yourself for another anticlimax:
The “ideal” order | New words | The “ideal” order, setting aside <5x words | New words | |
---|---|---|---|---|
2 John | 95 | 2 John | 92 | |
3 John | 56 | 3 John | 46 | |
Philemon | 80 | Philemon | 59 | |
1 John | 134 | Jude | 116 | |
2 Thessalonians | 126 | 1 John | 100 | |
Jude | 127 | 2 Thessalonians | 86 | |
Titus | 150 | Titus | 78 | |
1 Thessalonians | 147 | 2 Peter | 90 | |
2 Peter | 171 | 1 Thessalonians | 87 | |
Colossians | 163 | 2 Timothy | 81 | |
Philippians | 161 | 1 Timothy | 74 | |
Ephesians | 155 | Colossians | 68 | |
2 Timothy | 157 | Philippians | 63 | |
Galatians | 157 | Ephesians | 52 | |
1 Timothy | 170 | 1 Peter | 61 | |
1 Peter | 163 | Galatians | 59 | |
James | 173 | James | 67 | |
2 Corinthians | 211 | 2 Corinthians | 63 | |
1 Corinthians | 242 | Romans | 63 | |
Romans | 227 | 1 Corinthians | 59 | |
Hebrews | 288 | Hebrews | 57 | |
Revelation | 272 | Revelation | 92 | |
John | 241 | John | 90 | |
Mark | 327 | Mark | 86 | |
Matthew | 287 | Matthew | 35 | |
Luke | 407 | Luke | 13 | |
Acts | 574 | Acts | 21 |
So even setting aside all of the infrequent words in the gospels and Acts, you'd still be best off learning them last.
All contents © 2024 Adam Baker, except where otherwise noted.