I'm looking through some of the archived pages right now. One problem I see is that none of them list the author, so I'll probably just assign a generic author name to each entry file. I'll also have to fill in placeholder values for the hour, minute, and second variables.
For comments, the author information is there and so is most of the date info.
I also found the code for the PHP script that converts Greymatter entries into WordPress, which will help as a guide to make sure the cgi files I make will be converted correctly.
Currently I'm looking into how to break down the HTML to extract all of the information needed to build an entry file.
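For what it's worth, here is a rough Python sketch of the kind of extraction I have in mind, using only the standard library's html.parser. The class names (entry-title, entry-date, entry-body), the generic author "Archive", and the 00:00:00 time are all placeholders I made up for illustration; the real archived pages will almost certainly use different markup, and this version keeps only the text of the body, not its inner tags.

```python
from html.parser import HTMLParser

# Hypothetical class names: the real archive markup may differ.
FIELD_CLASSES = {"entry-title": "title", "entry-date": "date", "entry-body": "body"}

# Placeholder values for fields the archived pages do not carry (assumed).
DEFAULTS = {"author": "Archive", "hour": "00", "minute": "00", "second": "00"}

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img"}


class EntryExtractor(HTMLParser):
    """Collects the text inside elements whose class is in FIELD_CLASSES."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # name of the field being collected, if any
        self._depth = 0       # nesting depth inside that field's element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._current is not None:
            self._depth += 1
            return
        cls = dict(attrs).get("class")
        if cls in FIELD_CLASSES:
            self._current = FIELD_CLASSES[cls]
            self._depth = 1
            self.fields[self._current] = ""

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._current is None:
            return
        self._depth -= 1
        if self._depth == 0:
            # The field's own element has closed: tidy up and stop collecting.
            self.fields[self._current] = self.fields[self._current].strip()
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data


def extract_entry(page_html):
    """Return a dict of entry fields, with defaults for anything missing."""
    parser = EntryExtractor()
    parser.feed(page_html)
    parser.close()
    return {**DEFAULTS, **parser.fields}


if __name__ == "__main__":
    sample = (
        '<div class="entry-title">Site update</div>'
        '<div class="entry-date">03/14/2004</div>'
        '<div class="entry-body">New <b>forum</b> is up.</div>'
    )
    print(extract_entry(sample))
```

The dict it returns would still need to be written out in whatever layout the entry file expects, which I'd check against the converter script before settling on anything.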
EDIT:
Clark:
Actually, is it necessary to save the comments for each news post? Parsing each of the comments out looks like it'll be the biggest hurdle in this project. I can probably do it if you really want them saved, but things will be simplified a great deal if I can leave them out. Beyond that, I'm ready to start coding.