@DATA-DOG:
Thanks! I think I have an idea for a new project. :)
MMM
is anyone aware if this project is already being done by someone?
if not, perhaps we can brainstorm how to go about this, now that the pdf for the revised nwt is out.
a couple initial impressions:
All,
OK, I have completed all the work for the comprehensive NWT change list, and I am making the data available to anyone that wants it. First of all, I created a PDF with a mash of the NWTv1 (1984) and the NWTv2 (2013). It is the entire Bible, showing in-line the differences between each version. Here are a couple of images of what the PDF looks like:
Genesis:
Once the entire Bible was produced this way, I decided to go looking around. Take a look at the beginning of the book of Numbers. This book has a lot of changes because the NWTv2 replaces all spelled-out numbers with actual digits. However, I noticed that at the beginning of Numbers 1:2, the first word "TAKE" was capitalized in the NWT 1984 version, but not in the current version. I thought perhaps I had caught a parsing mistake from the program I used to gather the text from the WT website. But I went back and the word is actually capitalized. I also decided to check the printed NWT Bible. It is capitalized there too. Does anyone know why the WT might capitalize "TAKE"? I know they capitalized "YOU" to indicate a plural (y'all :))... but this?
For those of you who want to fuss with the data (I don't imagine there are a lot of you, but ok), my database has two very simple tables. One contains verse records with old and new text, along with marked up difference text, and finally a field that contains the "difference" log:
It also contains a "book" table with all the markup combined in a nifty format - this is what the final PDF was based off of... take it or leave it.
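For anyone trying to picture the layout, the two tables described above might look roughly like the sketch below. This is only an illustration using SQLite: every table name, column name, and helper function here is a guess made up for the example, not the actual schema of the database.

```python
import sqlite3

# Hypothetical schema: names are assumptions, not the real database layout.
SCHEMA = """
CREATE TABLE verse (
    book        TEXT    NOT NULL,
    chapter     INTEGER NOT NULL,
    verse       INTEGER NOT NULL,
    old_text    TEXT,   -- 1984 wording
    new_text    TEXT,   -- 2013 wording
    diff_markup TEXT,   -- marked-up difference text
    diff_log    TEXT,   -- the "difference" log
    PRIMARY KEY (book, chapter, verse)
);
CREATE TABLE book (
    name   TEXT PRIMARY KEY,
    markup TEXT         -- all verse markup for the book, combined
);
"""

def open_db(path=":memory:"):
    """Create a database with the sketched tables and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def changed_verses(conn, book):
    """Return (chapter, verse) pairs whose old and new text differ."""
    cur = conn.execute(
        "SELECT chapter, verse FROM verse "
        "WHERE book = ? AND old_text <> new_text "
        "ORDER BY chapter, verse",
        (book,),
    )
    return cur.fetchall()
```

With a layout like this, pulling out only the verses that changed in one book is a single query, which is about as far as the "fussing" needs to go for a first look.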
In any case, there you go. I don't want to place the links here because my database and the mash PDF are copyrighted material. But if you want to send me a PM, I'll supply the links. I imagine some of you might want the DB, some will want just the PDF... it's whatever you like. The PDF is about 12 MB and the database is about 114 MB.
Thanks,
MMM
The obvious answer, that everyone seems to overlook, is, of course, 42.
so gosh like at least 6 months ago i think i was perusing jw survey and saw the anonymous targeting the wt video and thought "sweet!!!" ...so i googled it, and nothing came up but the actual youtube video and the thing on jw survey (i posted a comment on jw survey asking about it but nobody replied).... so since it was never spoken of again i guess it was a hoax?
just one dude made a fake video and anonymous was never really after the wt?
Yeah, Mr Mustard, it may seem that if they are smart it will all be paperwork....
But then if you wanted to destroy a lot of paperwork quick....that ain't easy. Hard disks however....
True, that is why I think the records - physical records - do not exist as a DB at all (filing cabinet or otherwise). They exist as paper documents dispersed among the congregations' personal records. I don't think there is a centralized DB. If there is, I can't see it being on any network for download.
MMM
I don't even think there is a pedophile DB to download. These records probably exist as physical files buried in a filing cabinet, behind locked doors, deep within the legal department.
MMM
@Apognophos:
Yep, the removal of brackets was actually mentioned as part of the talks at the AGM when the Bible was introduced. That output is probably good enough to make a PDF for those who want to read through it carefully and deliberatively (I guess we're looking at distributing by torrent after all).
I'm embarrassed to say this, but I am having serious second thoughts about whether I should still take the time to try to group and filter changes by kind. Unfortunately I am right in the middle of a super-important project that I don't think I should postpone. This has been the case for a while, but I didn't mention it because I was planning to put that work on hold while I worked on this NWT project, but this weekend my business partner moved up the timetable that my project is connected to, and now I don't know if I can afford to delay my work on it by even a week.
Just as importantly, it seems to me that a program that sorts changes by type will only be of limited use, as the importance of a change is not determined just by its nature, but by its position. For instance, an inserted comma could be incredibly important one time out of a hundred, and this important comma would be lost in the mix if I shunted all punctuation changes into their own group, because no one would have the patience to look through the Added/Removed Commas group to see if any were noteworthy. The same goes for other kinds of changes, whether it's dropping the progressive action tenses (mostly unimportant, but may occasionally be significant), or anything else. It might be better after all to just present all changes in one place, like MMM's last sample output above, and let the reader look up specific verses that are doctrinally significant and simply take in all the changes to those verses at once.
I am irritated with myself for starting this project and then managing to contribute absolutely nothing. It paints me as much more of a flake than I actually am. But the timing of the project simply is very poor for me, as it turns out. I might still come back to the idea of grouping changes when time permits, but I think that for now I need to focus on the other project, which is potentially life-changing for me, if things pan out and if I give it my all. I am also going to disconnect from the forum at large again, as it takes too much of my time. However, I will continue to actively check this thread and respond as warranted.
I'm very grateful that someone showed up who actually could do this work better than I could, plus had the time to do it! Thanks to MeanMrMustard for all his work so far. If anyone else wants to work on grouping and filtering changes, I would be glad to give advice, though I think the problem I outlined above is a significant one to overcome.
No worries! This has been a fun IT-action-packed thread. In the end, it allowed me to expand my skills a bit. I think I'll produce the PDF and let anyone have it that wants it. I'll also make the DB available to anyone as well. I know there aren't too many people that would want a copy, but it's fine. I always considered coming up with a complete list of changes as the easy part. The hard part, as you have said, is taking that huge list of changes and coming up with a logical way to filter it down to the important changes. I am still open to suggestions on that.... :)
Sounds like a life-changing project should indeed take precedence! And you are right, posting on a forum can take up some time. I hope the project goes well for you.
MMM
@besty:
Interesting. Do you have a copy of the WT lib for download? I would be interested in any and all older versions too. I would love to see how they compare with one another (sneaking in changes).
MMM
I don't even think Anon has a DB. I think they are full of it.
MMM
Apognophos,
No problem. The output from google-diff-match-patch is fundamentally by character. In order to make it a word-level diff, you have to define what a word is (are spaces included? are they on the front or back?) and then encode the string: choose a Unicode character for each word and then let the normal character-level diff operate on the encoded string. Then un-encode it to bring the diffs back to the word level. Very neat. The trick was in defining a word. As it turns out, the best results come from a word being defined as any stretch of non-whitespace characters. Also, each run of whitespace is a separate word, and any punctuation is also a 1-char word.
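The encode, diff, un-encode trick can be sketched in a few lines of Python. To keep it self-contained, this sketch swaps in the standard library's `difflib.SequenceMatcher` for google-diff-match-patch as the character-level diff, so it is not the code that produced the output in this thread. The function names and the exact tokenizing regex are assumptions made for the example; in particular, it treats a "word" as a run of letters/digits, so an apostrophe inside a word is split off as its own 1-char token.

```python
import re
from difflib import SequenceMatcher

# Tokens: a run of whitespace, OR a single punctuation character,
# OR a run of word characters. Every character lands in exactly one token.
TOKEN_RE = re.compile(r"\s+|[^\w\s]|\w+")

def tokenize(text):
    return TOKEN_RE.findall(text)

def encode(tokens, table):
    """Map each distinct token to one private-use Unicode character."""
    out = []
    for tok in tokens:
        if tok not in table:
            # U+E000.. gives 6400 code points, plenty for a single verse.
            table[tok] = chr(0xE000 + len(table))
        out.append(table[tok])
    return "".join(out)

def word_diff(old, new):
    """Return a list of (op, text) pairs; op is '=', '-' or '+'."""
    table = {}
    a = encode(tokenize(old), table)
    b = encode(tokenize(new), table)
    reverse = {ch: tok for tok, ch in table.items()}

    def decode(s):
        return "".join(reverse[c] for c in s)

    ops = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("=", decode(a[i1:i2])))
        else:
            if i1 < i2:
                ops.append(("-", decode(a[i1:i2])))  # only in the old text
            if j1 < j2:
                ops.append(("+", decode(b[j1:j2])))  # only in the new text
    return ops
```

Because brackets are 1-char tokens of their own, calling `word_diff("In [the] beginning", "In the beginning")` reports the two brackets as removals and leaves the word between them untouched, instead of flagging the whole bracketed word as changed.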
Here is what Genesis chapter 1 looks like. The baseline is the 1984 version. If you see text with strikethrough, it means that this text was removed in the 2013 version. Text with a green background was not found in the 1984 version, but it was found in the 2013 version. You'll see that brackets are always eliminated. I've got the entire Bible like this in my DB... I just need to dump it to a file. I was hoping to make a PDF, but had not gotten around to it.
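As a rough illustration of that display convention, here is a small Python sketch that turns a list of diff operations into HTML with strikethrough for removed text and a green background for added text. The `(op, text)` input format, the function name, and the exact colour are all assumptions for the example; the actual markup stored in the DB may look nothing like this.

```python
from html import escape

def render_html(ops):
    """Render (op, text) pairs: '=' unchanged, '-' removed, '+' added."""
    parts = []
    for op, text in ops:
        safe = escape(text)
        if op == "-":
            # Present in 1984 only: strikethrough.
            parts.append(f"<del>{safe}</del>")
        elif op == "+":
            # Present in 2013 only: green background.
            parts.append(f'<span style="background:#cfc">{safe}</span>')
        else:
            parts.append(safe)
    return "".join(parts)
```

Keeping the rendering separate from the diff itself means the same list of operations could feed a PDF generator or a plain-text report later.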
Interesting that you can see the extent of the changes with just Genesis 1....
MMM