Project update – converting my Where I Read archive.
Working in Data Warehousing for the past 9 months has, if nothing else, given me a good idea of the scope and issues of my current project.
Right now, I’ve got about 10 megabytes of threads in printable form from RPG.net. I’ve written a neat little utility that scrapes those threads for posts by me. First-pass analysis says I’ve got a little over 500 posts there.
Now, I can pretty easily scrape the data, extract my posts, auto-format ’em into the pseudo-HTML WordPress accepts, and paste them in one by one. My challenge is going to be doing it 500+ times (or fewer, since a good number of those posts are discussion, and I’d prefer not to reproduce other people’s words if I can help it). That means losing the discussion and the great posts from other posters, but I plan on including links to the original threads for each book.
Another challenge is going to be working out how to date the posts. One simple option would be to just backdate the first post to some appropriately historic date, post each following post on the next day, and when I’m finally caught up, put in some links and resume the WIR.
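The one-post-per-day idea is simple enough to sketch in Python. This is only an illustration: the `assign_dates` helper, the start date, and the placeholder post bodies are all made up for the example, not taken from the actual utility.

```python
from datetime import date, timedelta


def assign_dates(posts, start):
    """Pair each post with a date, one day apart, beginning at `start`."""
    return [(start + timedelta(days=i), post) for i, post in enumerate(posts)]


# Hypothetical start date and placeholder post bodies.
schedule = assign_dates(["first post", "second post", "third post"], date(2000, 1, 1))
for day, body in schedule:
    print(day.isoformat(), body)
```

Using `timedelta` rather than bumping the day number by hand means month ends and leap days take care of themselves.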
If anyone knows a good way to bulk-load entries into WordPress, I’d love to hear it.
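One route that might be worth trying, offered as an untested assumption rather than a known answer: WordPress has an import tool that, as far as I know, can read an RSS 2.0 file, so the extracted posts could be written out as one feed with backdated `pubDate` values. The sketch below uses only the Python standard library; the `build_rss` helper, the channel title, and the sample posts are hypothetical, and which fields the importer actually honors would need checking.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def build_rss(posts, start):
    """Build an RSS 2.0 document from (title, body) pairs, one item per day."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Where I Read archive"  # placeholder title
    for i, (title, body) in enumerate(posts):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # RSS 2.0 expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(start + timedelta(days=i))
        # ElementTree escapes the pseudo-HTML body when serializing.
        ET.SubElement(item, "description").text = body
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample data; real titles and bodies would come from the scraper.
sample_posts = [("WIR: Part 1", "<p>First post.</p>"), ("WIR: Part 2", "<p>Second post.</p>")]
feed = build_rss(sample_posts, datetime(2000, 1, 1, tzinfo=timezone.utc))
print(feed)
```

Saving `feed` to a file would give a single upload instead of 500+ pastes, assuming the importer accepts it.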