Gengulphus wrote:If you're asking about the thread that lost its OP, yes, it's
http://boards.fool.co.uk/gdhyp-32nd-pur ... e#13285224. Looking at it again, it can be (and is in that link) loaded using the OP's 'mid': the potential minor trap is that if one tries to load the whole threads of posts using only the 'mid's of unremoved OPs, one will miss that thread. To get it right, one either needs to use the 'mid's of all OPs (and probably reject the ones where that produces an empty whole thread), or go through the 'mid's of all unremoved posts, tracking down the 'mid's of each one's OP, and then deal with the whole threads of all the OP 'mid's that one obtains that way.
If about the thread that needed re-archiving, sorry, no - and it's no longer an example anyway, as I have re-archived it!
Gengulphus
Thanks G - yes that was what I was referring to, sorry if I wasn't clear.
So (as I guess you knew already) the Fool boards continue to reference the deleted OP, which adds a little complication to what I was looking at, which is essentially a reference table recording, for each post:
Message ID
Board ID (with another table recording the board names and categories)
Author ID (with another table recording author names - AFAIK authorid is not a TMF field, so is created for this purpose)
Post number within the board.
Post date and time
Number of Recs
Message ID that this is a response to
Thread ID (Which I define as being the MID of the first post in the thread)
url of the post in the extended format like in your link above
It is only necessary to record the Subject title of each thread rather than each post.
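To make that layout concrete, here is a rough sketch of how it might look as SQLite tables created from Python. All table and column names are my own placeholders rather than TMF field names, and the types are guesses, so treat it as an illustration of the shape and not a finished design.

```python
import sqlite3

# Hypothetical names throughout; none of these are TMF field names.
SCHEMA = """
CREATE TABLE boards  (board_id  INTEGER PRIMARY KEY, name TEXT NOT NULL, category TEXT);
CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- One row per thread: thread_id is the MID of the thread's first post.
CREATE TABLE threads (thread_id INTEGER PRIMARY KEY, subject TEXT NOT NULL);
CREATE TABLE posts (
    mid          INTEGER PRIMARY KEY,                             -- Message ID
    board_id     INTEGER NOT NULL REFERENCES boards(board_id),
    author_id    INTEGER NOT NULL REFERENCES authors(author_id),
    post_number  INTEGER NOT NULL,                                -- number within the board
    posted_at    TEXT    NOT NULL,                                -- post date and time
    recs         INTEGER NOT NULL DEFAULT 0,
    reply_to_mid INTEGER,                                         -- NULL for a thread's first post
    thread_id    INTEGER NOT NULL REFERENCES threads(thread_id),
    url          TEXT    NOT NULL
);
"""

def create_db(path=":memory:"):
    """Create the sketch tables in a new SQLite database and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping the subject in the threads table is what lets it be stored once per thread rather than once per post.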
So with this anomaly you've identified I would need to either (a) rewrite the data such that the first undeleted post in the thread is seen as the thread master or (b) record information as above for the deleted post. I think the latter option makes most sense, so I'll just need to locate each of these anomalies (which should be relatively straightforward).
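Locating the anomalies could be as simple as looking for Thread IDs that never appear as a Message ID. A minimal sketch, assuming the post data is available as (mid, thread_id) pairs; the function name and the small ID values are made up for illustration:

```python
def find_missing_ops(posts):
    """Return the thread IDs whose first post has no record of its own.

    posts: iterable of (mid, thread_id) pairs, where thread_id is the MID
    of the thread's first post (an assumed input shape).
    """
    posts = list(posts)
    known_mids = {mid for mid, _ in posts}
    return sorted({thread_id for _, thread_id in posts if thread_id not in known_mids})


# Thread 1 has replies 2 and 3 but no record for post 1 itself;
# thread 10 is intact because post 10 is present.
print(find_missing_ops([(2, 1), (3, 1), (10, 10), (11, 10)]))  # prints [1]
```

Each ID returned would then need a record of its own under option (b).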
So this leads me on to what might be a workable solution to the archiving (the hybrid option three I mentioned previously).
As I mentioned before, I don't believe the Wayback Machine would be able to replicate all of the permutations and combinations needed to enable all of the navigation links to work.
But using the data tables I outline above (which will come to less than a Gig.) it would be possible to recreate pretty much all of the expected navigation options in a separate tool and then deliver the resultant webpage from the wayback machine in an iframe. Like this :
http://campozo.net/tmfget/container.html
All that would be required would be for the Wayback Machine to archive
each post url in the format boards.fool.co.uk/gdhyp-32nd-purchase-top-up-13285077.aspx
and each thread in the format boards.fool.co.uk/gdhyp-32nd-purchase-top-up-13285077.aspx?sort=whole
That is a finite and, I believe, manageable list of URLs: the first list has fewer than 7.8m records and the second substantially fewer than that.
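Generating the two lists from the reference data should be mechanical. A small sketch, assuming the subject-derived part of the URL (the "slug") is already known for each post; the function names are my own:

```python
BASE = "boards.fool.co.uk"

def post_url(slug, mid):
    """URL of a single post, in the format given above."""
    return f"{BASE}/{slug}-{mid}.aspx"

def thread_url(slug, op_mid):
    """URL of the whole thread, keyed on the MID of its first post."""
    return post_url(slug, op_mid) + "?sort=whole"


print(post_url("gdhyp-32nd-purchase-top-up", 13285077))
# boards.fool.co.uk/gdhyp-32nd-purchase-top-up-13285077.aspx
print(thread_url("gdhyp-32nd-purchase-top-up", 13285077))
# boards.fool.co.uk/gdhyp-32nd-purchase-top-up-13285077.aspx?sort=whole
```

The first function would be run over every Message ID and the second over every distinct Thread ID, which is why the second list is so much shorter.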
I'm not sure, though, what displaying someone else's webpage in an iframe implies about copyright or liability for the content. Does anyone have any thoughts on that?