Project update – converting my Where I Read archive.

Working in Data Warehousing for the past 9 months has, if nothing else, given me a good idea of the scope and issues of my current project.

Right now, I’ve got about 10 megabytes of threads saved in printable form, and I’ve written a neat little utility that scrapes those threads for posts by me. First-pass analysis says I’ve got a little over 500 posts there.

Now, I can pretty easily scrape the data, extract my posts, auto-format ’em into the pseudo-HTML WordPress accepts, and paste them in one by one. My challenge is going to be doing it 500+ times (or fewer, since a good number of those posts are discussion, and I’d prefer not to reproduce other people’s words if I can help it). I’ll be losing out on the discussion and great posts from other posters, but I also plan on including links to the original threads for each book.
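For the curious, the extract-and-format step might look roughly like the sketch below. It is not the actual utility: it assumes the threads have already been parsed into records with "author" and "body" keys, and the function names and the blank-line paragraph convention are made up for illustration.

```python
from html import escape


def extract_my_posts(posts, author):
    """Keep only the posts written by `author` (hypothetical record layout)."""
    return [p for p in posts if p["author"] == author]


def to_wordpress_html(body):
    """Wrap each blank-line-separated paragraph in <p> tags, escaping HTML characters."""
    paragraphs = [p.strip() for p in body.split("\n\n") if p.strip()]
    return "\n\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)


# Made-up sample data, standing in for a parsed thread.
thread = [
    {"author": "me", "body": "Chapter one.\n\nSwords & sorcery ensue."},
    {"author": "someone_else", "body": "Great post!"},
]

for post in extract_my_posts(thread, "me"):
    print(to_wordpress_html(post["body"]))
```

Filtering before formatting keeps other people’s words out of the output entirely, which is the point of the exercise.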

Another challenge is going to be working out a date format. One simple option would be to just backdate the first post to some appropriately-historic date, post each following post on the next day, and when I’m finally caught up, put in some links and resume the WIR.
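As a rough sketch of that one-post-per-day scheme, assuming Python and a placeholder start date (the date below is purely illustrative, not one I’ve settled on):

```python
from datetime import date, timedelta


def assign_backdates(post_count, start):
    """Return one date per post: the first on `start`, each later post one day after the last."""
    return [start + timedelta(days=i) for i in range(post_count)]


# Hypothetical start date and a round post count, for illustration only.
dates = assign_backdates(500, date(2010, 1, 1))
print("first:", dates[0])
print("last: ", dates[-1])
print("span in days:", (dates[-1] - dates[0]).days + 1)
```

Each post’s position in the archive maps directly to its date, so inserting or dropping a post only means regenerating the list.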

If anyone knows a good way to bulk-load entries into WordPress, I’d love to hear it.


2 thoughts on "Project update – converting my Where I Read archive."

  1. Leaving aside that 500 posts at one post a day would be 2 years (God, has it been that long?), wouldn’t…. Well, no, not leaving aside. Focusing quite clearly on that point! Two years!

    I’d almost suggest for the old threads, just archive them in some sort of document format on… I dunno, Google Docs or something, and post a single post linking to them, and maybe back to the original threads. There’s not really a point in leaving them as a “living” document, is there?


    • That is a very good point. Honestly, I don’t have any real objection to having the public archives be the threads themselves, since they include all of the discussion and back-and-forth.

      Maybe I can just post book summaries for the books up through Magic’s Pawn, repost the entries for Magic’s Price, and start from there.

      I will consider this.


