Railroad Magazine Index - Frequently Asked Questions

How Do I Search?

If you just enter a few words, the search returns any magazine page containing all of the terms, but not necessarily together. The underlying search engine also supports more advanced queries. Here are some hints on the advanced query syntax:

Normal search terms are case insensitive, so searching for bcol would match BCOL, bcol, BCol, etc.
If you want to search specifically for a phrase, put the phrase in quotes. For example, searching for "BC Rail" will only bring back pages with BC Rail as a phrase, rather than any page with BC and Rail in it somewhere.
Search terms match whole words only, so if you search for M420, it won't match M420W or M420B. If you want to include all the suffixes, add an asterisk at the end: M420*
More complex searches can be made using the Boolean operators AND, NOT, and OR, with parentheses to group things together. Note that the operators must be capitalized. So if you want pages about BC Rail or British Columbia Railway M630Ws, you could search for: ("BC Rail" OR "British Columbia Railway") AND M630W
Using the "Advanced Search" link, you can limit the date range or the specific magazines to be searched. For example, if you were interested in the provincial sale of BC Rail to CN, you might only want to look in prototype magazines from 2001-2004.
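
The matching rules above can be illustrated with a small Python sketch. This is not the site's actual search engine, only a toy model of the behavior described in this FAQ. The function names term_matches and simple_search are made up for illustration, and the Boolean operators and parentheses are left out to keep it short.

```python
import re

def term_matches(term, page_text):
    """Check one search term against a page, following the rules in this FAQ."""
    # Lowercase everything so matching is case insensitive.
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    term = term.lower()
    if term.startswith('"') and term.endswith('"'):
        # Quoted phrase: the words must appear next to each other, in order.
        phrase = re.findall(r"[a-z0-9]+", term)
        n = len(phrase)
        return n > 0 and any(
            words[i:i + n] == phrase for i in range(len(words) - n + 1)
        )
    if term.endswith("*"):
        # Trailing asterisk: match any word that starts with the prefix.
        prefix = term[:-1]
        return any(w.startswith(prefix) for w in words)
    # Plain term: must match a whole word.
    return term in words

def simple_search(terms, page_text):
    """A bare list of terms must all match, but not necessarily together."""
    return all(term_matches(t, page_text) for t in terms)

page = "The BCOL M420W led the train on BC Rail trackage."
print(term_matches("bcol", page))            # True  (case insensitive)
print(term_matches("M420", page))            # False (whole words only)
print(term_matches("M420*", page))           # True  (asterisk covers suffixes)
print(term_matches('"BC Rail"', page))       # True  (phrase match)
print(simple_search(["rail", "bcol"], page)) # True  (all terms, any order)
```

A real engine works from a prebuilt index rather than scanning page text like this, but the results for these simple cases should line up with the hints above.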

Why Isn't Magazine XYZ Included?

Probably because we don't have digital copies of it to OCR and index, or just because we didn't think of it. Please see "How Do I Get a Magazine Added?"

How Do I Get a Magazine Added?

At the very least, we need good-quality scans of each issue. That's the most time-consuming and tedious part of the process. 150 dpi is adequate, and 300 dpi is preferred. If they're already OCR'd, or generated from the original source files with a text layer, that's even better. If you've got those, or are willing to do the work to acquire and scan the issues, then please contact us and we'll discuss details.

If it's a magazine that's still actively published, we'd prefer that you be the copyright holder or be able to point us to the appropriate contact to get permission. While we don't need permission (see the question about copyright), having it up front helps fend off lawyers, which makes our lives better.

Why Is Some of the Excerpt Text So Bad?

We're largely dependent upon optical character recognition to read old magazine scans. OCR has come a very long way, but it still makes a lot of mistakes, particularly on poor scans or with fonts that it just can't recognize. We don't have the time to go back and manually correct the text. However, if we get a newer, cleaner scan or a PDF with better OCR, we can regenerate the index and pick up the improved text.

Who Created/Maintains/Owns the Index?

For now, Michael Petersen and Nathan Holmes - both lifelong modelers, railfans, and railroad history enthusiasts. We created it, we maintain it, and we fund it. We also earn nothing from it. For the moment, we've decided to do it that way in order to maintain some neutrality and independence from any company or organization. In the future, we may consider limited sponsorship or donations to help offset the cost of running the server, depending on how popular it becomes.

Is It Sustainable?

In theory, yes. The whole system is written in fairly modern PHP and Python and can run on pretty much any modern Linux server. Everything we've built the site upon is open source, and we intend to open source the site code itself once we get things up and running (and can do some documentation). The one thing that won't be open is the actual database, since - in theory - an adequately motivated programmer could get the full article text back out of it. That would pose copyright issues.

The import of new issues is highly automated, requiring little input from us other than running a few shell scripts. It shouldn't take us more than an hour a month to add in all the new issues. Adding a completely new magazine is a bit harder, but we're always interested in adding more content to the index.

In the event that we lose interest or are not able to sustain it, we'll ensure an orderly transition of the database and software to a new owner.

What About Copyright?

For the most part, we have contacted the publishers or rights-holders of the magazines in the index and they've been supportive of these efforts. We've tried to link search results to the appropriate sources for the content, if a legitimate online source is available. This helps searchers get the content they're looking for, and it helps publishers increase the value of their digital properties at no cost to them.

The decision in Authors Guild v. HathiTrust holds that digitizing printed material for creating a text search engine is fair use. What we're doing clearly falls under that precedent, since it's transformative, non-commercial, does not diminish the value of the works indexed, and only provides a small snippet of the content in the results. That said, we have better things to do than play with a bag of lawyers. If a publisher contacts us and asks us not to include their content, we will respect that request.

It is impossible to get the original PDFs, full-size page images, or the full original text of the articles from the index. The first two are simply not on the servers to get. We don't store them there. The text is stored in an index table that would be exceedingly difficult to reverse engineer (because it's optimized for word search access) and even then, it's hidden in a backend service that's not exposed to the outside world. And no, just because you can't find a back issue on the open market doesn't mean we can send you the PDF.

Seriously, be nice. This is an all-volunteer effort by a couple guys in their spare time for the good of our hobby.