Archiving TMF boards

Formerly "Lemon Fool - Improve the Recipe" repurposed as Room 102 (see above).
Gromley
Posts: 33
Joined: November 4th, 2016, 5:53 pm
Has thanked: 3 times
Been thanked: 1 time

Re: Archiving TMF boards

#13511

Postby Gromley » December 8th, 2016, 10:33 pm

Clariman wrote:Hi Gromley. The technical aspects may be the easiest part of it. When Stooz and I looked at this with two other Fools, in discussion with TMF, the costs were into 5 figures if I recall. You address some of them but not the fact that TMF own the data and the copyright. To legitimately copy it there may be licensing and services costs involved, in addition to insurances.

Clariman


Many thanks Clariman, really important to understand where the pinch-points are; I also fully understand stooz's (I think) post stressing that your focus is on the future not the past - Absolutely!

One question (actually two), though, when you say:
To legitimately copy it there may be licensing and services costs involved, in addition to insurances.


I thought I had read, further back, that Tarantula had said they would be happy to grant rights subject to recognition of copyright? So I was just wondering if you had any breakdown of those costs? Or, in the absence of that, and fully understanding that this is not your priority, are you aware if anyone else is currently having these discussions with TMF? It's certainly something I'd be prepared to take on.

Grateful for any thoughts on that, but meanwhile - in case it's not obvious - much appreciation for what you & stooz have done here!

mc2fool
Lemon Half
Posts: 8088
Joined: November 4th, 2016, 11:24 am
Has thanked: 7 times
Been thanked: 3125 times

Re: Archiving TMF boards

#13520

Postby mc2fool » December 9th, 2016, 12:16 am

Clariman wrote:Hi Gromley. The technical aspects may be the easiest part of it. When Stooz and I looked at this with two other Fools, in discussion with TMF, the costs were into 5 figures if I recall. You address some of them but not the fact that TMF own the data and the copyright. To legitimately copy it there may be licensing and services costs involved, in addition to insurances.

I think you've got this exactly the wrong way round. The Wayback machine already has huge amounts of the world wide web -- including large chunks of the TMF boards -- on it and they don't seem to have any licensing, insurance or copyright issues. So, let them store the archive. That is their raison d'être after all.

The technical aspects are the issue, in particular, as Gromley has identified, the addressing. I think the question (which, since my OP, I still haven't found the time to research and think through :oops: ), is not whether all of the variants can be saved in the Wayback Machine but whether there is a useful enough subset that can be -- if it's useful enough then just one method will do! ;)

Gromley: the Wayback Machine does store and act on 301 redirects. E.g. http://web.archive.org/web/20161209000547/http://boards.fool.co.uk/Message.aspx?mid=5667966. How useful that is, I haven't found the time to research and think through...
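The archived link above follows the pattern web.archive.org/web/, then a 14-digit capture timestamp, then the original URL. A minimal sketch in Python of building such a link; the function name is hypothetical, and the sketch does not check whether a capture actually exists for that timestamp.

```python
from datetime import datetime

WAYBACK_PREFIX = "http://web.archive.org/web/"


def wayback_link(original_url: str, captured_at: datetime) -> str:
    """Build a Wayback Machine link for a capture time (hypothetical helper)."""
    # Captures are addressed by a YYYYMMDDhhmmss timestamp in the path.
    stamp = captured_at.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}{stamp}/{original_url}"


if __name__ == "__main__":
    # Rebuilds the example link quoted in the post above.
    print(wayback_link(
        "http://boards.fool.co.uk/Message.aspx?mid=5667966",
        datetime(2016, 12, 9, 0, 5, 47),
    ))
```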

Oh, on a quick google, What is the Wayback Machine's Copyright Policy?

PinkDalek
Lemon Half
Posts: 6139
Joined: November 4th, 2016, 1:12 pm
Has thanked: 1589 times
Been thanked: 1801 times

Re: Archiving TMF boards

#13522

Postby PinkDalek » December 9th, 2016, 12:35 am

mc2fool wrote:...

Oh, on a quick google, What is the Wayback Machine's Copyright Policy?


That includes "Identification of the copyrighted work that you claim has been infringed" which is interesting in itself once the TMF boards are no longer read only. If I felt the copyright of one of my posts back over at TMF had been infringed and I was concerned about it, I'd find it impossible to find my original. Other than over at the waybackmachine, if archived!

Anyway, more seriously, I thought the posters own the copyright (albeit they've granted TMF an irrevocable licence to reproduce etc***) so I don't really follow the "fact that TMF own ... the copyright" part.

*** http://www.fool.co.uk/help/terms-and-co ... ourcontent

ReformedCharacter
Lemon Quarter
Posts: 3169
Joined: November 4th, 2016, 11:12 am
Has thanked: 3734 times
Been thanked: 1539 times

Re: Archiving TMF boards

#13524

Postby ReformedCharacter » December 9th, 2016, 12:39 am

You address some of them but not the fact that TMF own the data and the copyright. To legitimately copy it there may be licensing and services costs involved, in addition to insurances.


TMF don't own the data (if I assume that to be board posts); they grant themselves a perpetual non-exclusive license to the copyright.

see also: https://www.gov.uk/guidance/exceptions-to-copyright

particularly under 'Fair Dealing'. I scraped all the posts from the TMF boards of interest to me. I don't believe I have infringed anyone's copyright and nor would I wish to.

RC

Clariman
Lemon Quarter
Posts: 3288
Joined: November 4th, 2016, 12:17 am
Has thanked: 3134 times
Been thanked: 1566 times

Re: Archiving TMF boards

#13544

Postby Clariman » December 9th, 2016, 8:07 am

Mc2Fool - My comment about the copyright/licensing/insurance issues being more significant than the technical ones, is true. Having worked in IT-related roles for nearly 35 years, and as a former techie myself, my experience is that great technicians tend to see every challenge as a technical one and get excited about solving it. However, they frequently ignore or don't even see the more important non-technical issues. I have no doubt that a half-decent technical solution to a TMF archive could be created.

All - yes, the individual posters own the copyright of their posts on TMF, but granted TMF royalty-free and perpetual use. So, yes, any individual user can extract their own posts. That is a completely different matter to another website downloading and using the content of every single user who ever posted on TMF, without their individual permissions. One option would be for TMF to formally sub-license it to others (their Ts & Cs make that possible), but there would need to be a formal and legal structure in place for the sub-licensing agreement.

If the wayback machine archives it, I have no problem with that. That is up to them. Note also that Wayback make it explicit that its use is non-commercial i.e. it is purely for archival purposes.

Clariman

mc2fool
Lemon Half
Posts: 8088
Joined: November 4th, 2016, 11:24 am
Has thanked: 7 times
Been thanked: 3125 times

Re: Archiving TMF boards

#13555

Postby mc2fool » December 9th, 2016, 8:49 am

Clariman wrote:Mc2Fool - My comment about the copyright/licensing/insurance issues being more significant than the technical ones, is true.

Not in the case of getting everything into the Wayback Machine, which is what I was talking about.

Gengulphus
Lemon Quarter
Posts: 4255
Joined: November 4th, 2016, 1:17 am
Been thanked: 2631 times

Re: Archiving TMF boards

#13719

Postby Gengulphus » December 9th, 2016, 4:23 pm

Clariman wrote:... but not the fact that TMF own the data and the copyright. ...


Not entirely. They don't own the copyright to the user-supplied content - the user who wrote it does, and they merely own an irrevocable license to use it. But they do own the copyright to what they've added to the user-supplied content - e.g. their headers, the various links e.g. to next and previous posts, the rec counts, etc - and the license does include the right to sub-license. The net result is that I believe they can license the whole lot - but they cannot sell the copyright to the whole lot.

The approach I'm taking on my considerably more limited project to make certain (or as certain as reasonably possible) that stuff about my demo HYPs is preserved is akin to your third, "hybrid" approach: I write the material that will appear directly; I use the Wayback Machine to make certain anything I link to has been archived.

Incidentally, a couple of minor traps I've observed while doing that: I think there's been a suggestion that archiving the "whole thread" view of each OP rather than the "single post" view of each post might be a good idea. The first trap is that a thread may no longer have an OP - for example, one of my OPs was seriously misformatted. I reformatted, reposted as a reply and asked the moderators to delete the OP when this was pointed out (the misformatting was causing "whole thread" view to be displayed too wide, so it needed to go to solve that problem). The moderators did so and so the thread no longer has an OP.

The second is that it can happen that a thread has been added to since the last time it was archived as a whole thread - in which case you get an incomplete version of it when you ask for it. You can force it to be re-archived, but obviously you need to watch out for it happening to know when to force re-archiving... It was only when I found something missing from an archived thread about GDHYP that I realised this was happening.
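The stale-thread trap can be checked mechanically if the time of the last post in a thread is known. A minimal sketch in Python, assuming the capture timestamp is the 14-digit value seen in Wayback Machine links and that both times are expressed in the same timezone; the function names are hypothetical, and the "save" address used to request a fresh capture is an assumption about the Wayback Machine's Save Page Now feature rather than something confirmed in this thread.

```python
from datetime import datetime

# Assumed Save Page Now prefix; requesting this address asks for a new capture.
SAVE_PREFIX = "https://web.archive.org/save/"


def parse_wayback_timestamp(stamp: str) -> datetime:
    """Turn a YYYYMMDDhhmmss capture timestamp into a datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


def needs_rearchive(snapshot_stamp: str, last_post_time: datetime) -> bool:
    """True when the thread gained a post after the capture was taken."""
    return parse_wayback_timestamp(snapshot_stamp) < last_post_time


def save_request_url(thread_url: str) -> str:
    """Address to visit in order to request a fresh capture of thread_url."""
    return SAVE_PREFIX + thread_url
```

Comparing against the last post time avoids re-archiving threads that have not changed since their capture.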

Gengulphus

Gromley
Posts: 33
Joined: November 4th, 2016, 5:53 pm
Has thanked: 3 times
Been thanked: 1 time

Re: Archiving TMF boards

#13800

Postby Gromley » December 9th, 2016, 7:20 pm

Gengulphus wrote:
The approach I'm taking on my considerably more limited project to make certain (or as certain as reasonably possible) that stuff about my demo HYPs is preserved is akin to your third, "hybrid" approach: I write the material that will appear directly; I use the Wayback Machine to make certain anything I link to has been archived.

Incidentally, a couple of minor traps I've observed while doing that: I think there's been a suggestion that archiving the "whole thread" view of each OP rather than the "single post" view of each post might be a good idea. The first trap is that a thread may no longer have an OP - for example, one of my OPs was seriously misformatted. I reformatted, reposted as a reply and asked the moderators to delete the OP when this was pointed out (the misformatting was causing "whole thread" view to be displayed too wide, so it needed to go to solve that problem). The moderators did so and so the thread no longer has an OP.

The second is that it can happen that a thread has been added to since the last time it was archived as a whole thread - in which case you get an incomplete version of it when you ask for it. You can force it to be re-archived, but obviously you need to watch out for it happening to know when to force re-archiving... It was only when I found something missing from an archived thread about GDHYP that I realised this was happening.

Gengulphus


Interesting point about threads - do you have a link or 'mid' to the thread in question? It would be useful to see how this is rendered.

Gengulphus
Lemon Quarter
Posts: 4255
Joined: November 4th, 2016, 1:17 am
Been thanked: 2631 times

Re: Archiving TMF boards

#13870

Postby Gengulphus » December 10th, 2016, 3:53 am

Gromley wrote:
Gengulphus wrote:Incidentally, a couple of minor traps I've observed while doing that: I think there's been a suggestion that archiving the "whole thread" view of each OP rather than the "single post" view of each post might be a good idea. The first trap is that a thread may no longer have an OP - for example, one of my OPs was seriously misformatted. I reformatted, reposted as a reply and asked the moderators to delete the OP when this was pointed out (the misformatting was causing "whole thread" view to be displayed too wide, so it needed to go to solve that problem). The moderators did so and so the thread no longer has an OP.

The second is that it can happen that a thread has been added to since the last time it was archived as a whole thread - in which case you get an incomplete version of it when you ask for it. You can force it to be re-archived, but obviously you need to watch out for it happening to know when to force re-archiving... It was only when I found something missing from an archived thread about GDHYP that I realised this was happening.


Interesting point about threads - do you have a link or 'mid' to the thread in question? It would be useful to see how this is rendered.


If you're asking about the thread that lost its OP, yes, it's http://boards.fool.co.uk/gdhyp-32nd-purchase-top-up-13285077.aspx?sort=whole#13285224. Looking at it again, it can be (and is in that link) loaded using the OP's 'mid': the potential minor trap is that if one tries to load the whole threads of posts using only the 'mid's of unremoved OPs, one will miss that thread. To get it right, one either needs to use the 'mid's of all OPs (and probably reject the ones where that produces an empty whole thread), or go through the 'mid's of all unremoved posts, tracking down the 'mid's of each one's OP, and then deal with the whole threads of all the OP 'mid's that one obtains that way.
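The second approach, walking from every unremoved post up to its OP, can be sketched as follows. This is a minimal Python illustration, assuming a mapping from each unremoved post's 'mid' to the 'mid' it replies to (None for an OP) has already been collected; the function names and the sample numbers are hypothetical.

```python
from typing import Dict, Optional, Set


def find_thread_root(mid: int, reply_to: Dict[int, Optional[int]]) -> int:
    """Follow reply links upward until a post with no known parent is reached."""
    seen = set()
    current = mid
    while True:
        if current in seen:
            raise ValueError(f"reply loop at mid {current}")
        seen.add(current)
        # A removed OP is not a key in reply_to, so .get() returns None
        # and the removed OP's own mid is reported as the root.
        parent = reply_to.get(current)
        if parent is None:
            return current
        current = parent


def thread_roots(reply_to: Dict[int, Optional[int]]) -> Set[int]:
    """OP mids of every thread that has at least one unremoved post."""
    return {find_thread_root(mid, reply_to) for mid in reply_to}


def removed_roots(reply_to: Dict[int, Optional[int]]) -> Set[int]:
    """Roots that are referenced by replies but are not themselves recorded."""
    return thread_roots(reply_to) - set(reply_to)


if __name__ == "__main__":
    # Post 100 was removed: replies still point at it, but it has no entry.
    sample = {101: 100, 102: 101, 200: None, 201: 200}
    print(sorted(thread_roots(sample)), sorted(removed_roots(sample)))
```

Because the walk starts from replies, a thread whose OP was removed is still found, which is the case that loading only unremoved OPs would miss.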

If about the thread that needed re-archiving, sorry, no - and it's no longer an example anyway, as I have re-archived it!

Gengulphus

Gromley
Posts: 33
Joined: November 4th, 2016, 5:53 pm
Has thanked: 3 times
Been thanked: 1 time

Re: Archiving TMF boards

#13916

Postby Gromley » December 10th, 2016, 12:25 pm

Gengulphus wrote:
If you're asking about the thread that lost its OP, yes, it's http://boards.fool.co.uk/gdhyp-32nd-pur ... e#13285224. Looking at it again, it can be (and is in that link) loaded using the OP's 'mid': the potential minor trap is that if one tries to load the whole threads of posts using only the 'mid's of unremoved OPs, one will miss that thread. To get it right, one either needs to use the 'mid's of all OPs (and probably reject the ones where that produces an empty whole thread), or go through the 'mid's of all unremoved posts, tracking down the 'mid's of each one's OP, and then deal with the whole threads of all the OP 'mid's that one obtains that way.

If about the thread that needed re-archiving, sorry, no - and it's no longer an example anyway, as I have re-archived it!

Gengulphus


Thanks G - yes that was what I was referring to, sorry if I wasn't clear.

So (as I guess you knew already) the Fool boards continue to reference the deleted 'OP', which adds a little complication to what I was looking at, which is essentially a reference table, recording for each post:

Message ID
Board ID (with another table recording the board names and categories)
Author ID (with another table recording author names - AFAIK authorid is not a TMF field, so is created for this purpose)
Post number within the board.
Post date and time
Number of Recs
Message ID that this is a response to
Thread ID (Which I define as being the MID of the first post in the thread)
url of the post in the extended format like in your link above


It is only necessary to record the Subject title of each thread rather than each post.
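The reference table described above could be laid out along these lines. This is a sketch only, using Python's built-in sqlite3 module; the table and column names are assumptions chosen to mirror the list of fields, not an agreed design.

```python
import sqlite3

# One row per post, with small lookup tables for boards, authors and threads.
SCHEMA = """
CREATE TABLE boards  (board_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE threads (thread_id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE posts (
    mid          INTEGER PRIMARY KEY,                    -- Message ID
    board_id     INTEGER REFERENCES boards(board_id),
    author_id    INTEGER REFERENCES authors(author_id),  -- created for this purpose
    post_number  INTEGER,                                -- number within the board
    posted_at    TEXT,                                   -- post date and time
    recs         INTEGER,                                -- number of Recs
    reply_to_mid INTEGER,                                -- message this responds to
    thread_id    INTEGER REFERENCES threads(thread_id),  -- mid of the first post
    url          TEXT                                    -- extended-format URL
);
"""


def create_reference_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the reference tables in a new SQLite database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping the subject in the threads table rather than in posts reflects the point that the title only needs recording once per thread.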

So with this anomaly you've identified I would need to either (a) rewrite the data such that the first undeleted post in the thread is seen as the thread master or (b) record information as above for the deleted post. I think the latter option makes most sense, so I'll just need to locate each of these anomalies (relatively straightforward).
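Locating the anomalies for option (b) amounts to finding every "response to" Message ID that has no record of its own. A minimal Python sketch, assuming the reply links are held as a mapping from Message ID to parent Message ID; the names are hypothetical, and treating every missing parent as a thread-starting post is a simplifying assumption, since a deleted reply would look the same.

```python
from typing import Dict, Optional, Set


def missing_parents(reply_to: Dict[int, Optional[int]]) -> Set[int]:
    """Message IDs that are replied to but have no record of their own."""
    known = set(reply_to)
    return {
        parent
        for parent in reply_to.values()
        if parent is not None and parent not in known
    }


def add_placeholders(reply_to: Dict[int, Optional[int]]) -> Dict[int, Optional[int]]:
    """Return a copy with a top-level placeholder entry for each missing parent."""
    patched = dict(reply_to)
    for mid in missing_parents(reply_to):
        # Assumption: the missing post started its thread, so it has no parent.
        patched[mid] = None
    return patched
```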

So this leads me on to what might be a workable solution to the archiving (the hybrid option three I mentioned previously).

As I mentioned before, I don't believe the Wayback Machine would be able to replicate all of the permutations and combinations needed to enable all of the navigation links to work.

But using the data tables I outline above (which will come to less than a gigabyte) it would be possible to recreate pretty much all of the expected navigation options in a separate tool and then deliver the resultant webpage from the Wayback Machine in an iframe. Like this: http://campozo.net/tmfget/container.html
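The container idea can be illustrated with a few lines of Python that generate such a page. This is a sketch under stated assumptions, not a description of how the linked container page is built: the function name, the link labels and the page layout are all hypothetical.

```python
from html import escape
from typing import Optional


def container_html(wayback_url: str,
                   prev_url: Optional[str] = None,
                   next_url: Optional[str] = None) -> str:
    """Wrap an archived page in an iframe with separately generated navigation."""
    links = []
    if prev_url:
        links.append(f'<a href="{escape(prev_url, quote=True)}">Previous</a>')
    if next_url:
        links.append(f'<a href="{escape(next_url, quote=True)}">Next</a>')
    nav = " | ".join(links)
    # The archived page itself is loaded from the Wayback Machine, not copied.
    frame = f'<iframe src="{escape(wayback_url, quote=True)}" width="100%" height="800"></iframe>'
    return (
        "<!DOCTYPE html>\n<html><body>\n"
        f"<nav>{nav}</nav>\n"
        f"{frame}\n"
        "</body></html>\n"
    )
```

The navigation targets would come from the reference tables, so the archived pages never need working navigation links of their own.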

All that would be required would be for the wayback machine to archive

each post url in the format boards.fool.co.uk/gdhyp-32nd-purchase-top-up-13285077.aspx
and each thread in the format boards.fool.co.uk/gdhyp-32nd-purchase-top-up-13285077.aspx?sort=whole

That is a finite and, I believe, manageable list of URLs: the first list has fewer than 7.8m records and the second substantially fewer.
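Producing the two lists is mechanical once each post's URL stem and Message ID are known. A minimal Python sketch, assuming each record carries the subject-derived stem (for example gdhyp-32nd-purchase-top-up), the Message ID, and a flag marking the first post of a thread; the record layout and function names are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

BASE = "http://boards.fool.co.uk/"


def post_url(stem: str, mid: int) -> str:
    """Single-post view of a message."""
    return f"{BASE}{stem}-{mid}.aspx"


def thread_url(stem: str, thread_mid: int) -> str:
    """Whole-thread view, addressed by the first post of the thread."""
    return post_url(stem, thread_mid) + "?sort=whole"


def urls_to_archive(records: Iterable[Tuple[str, int, bool]]) -> Iterator[str]:
    """Yield every post URL, plus a whole-thread URL for each thread start."""
    for stem, mid, starts_thread in records:
        yield post_url(stem, mid)
        if starts_thread:
            yield thread_url(stem, mid)
```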

I'm not sure, though, what displaying someone else's webpage in an iframe implies about copyright or liability for the content. Does anyone have any thoughts on that?

Gengulphus
Lemon Quarter
Posts: 4255
Joined: November 4th, 2016, 1:17 am
Been thanked: 2631 times

Re: Archiving TMF boards

#13990

Postby Gengulphus » December 10th, 2016, 5:58 pm

Gromley wrote:
Gengulphus wrote:
If you're asking about the thread that lost its OP, yes, it's http://boards.fool.co.uk/gdhyp-32nd-pur ... e#13285224. Looking at it again, it can be (and is in that link) loaded using the OP's 'mid': the potential minor trap is that if one tries to load the whole threads of posts using only the 'mid's of unremoved OPs, one will miss that thread. To get it right, one either needs to use the 'mid's of all OPs (and probably reject the ones where that produces an empty whole thread), or go through the 'mid's of all unremoved posts, tracking down the 'mid's of each one's OP, and then deal with the whole threads of all the OP 'mid's that one obtains that way.

If about the thread that needed re-archiving, sorry, no - and it's no longer an example anyway, as I have re-archived it!

Gengulphus


Thanks G - yes that was what I was referring to, sorry if I wasn't clear.


No problem - I was 95% certain I'd read your question correctly - it's just that as you'd quoted the bit about the thread that needed re-archiving, I wondered whether I was missing something!

Gengulphus

odysseus2000
Lemon Half
Posts: 6545
Joined: November 8th, 2016, 11:33 pm
Has thanked: 1580 times
Been thanked: 993 times

Re: Archiving TMF boards

#25099

Postby odysseus2000 » January 21st, 2017, 11:56 pm

Re: Archiving TMF boards
by odysseus2000 » January 21st, 2017, 11:52 pm

I was wondering if someone could summarise what has happened with archiving. I note the prior post was on the 10th of December, so is everyone happily archived or has the discussion moved elsewhere?

Did all of tmf get archived on the wayback machine? I have had a go at trying to use it, software/hardware as below, and didn't get very far which could again be me but I wondered if there were known issues that I am ignorant of.

On another specific question: I tried to use Random Ambler's download tool with a MacBook Air and Safari and nothing happened. Is this a known issue, or does it indicate that I have done something wrong? I followed the instructions and got a URL with the mid and included the &sort=username, but got no download and no error. Probably me, but I would appreciate any help.

Kind regards,

mc2fool
Lemon Half
Posts: 8088
Joined: November 4th, 2016, 11:24 am
Has thanked: 7 times
Been thanked: 3125 times

Re: Archiving TMF boards

#25111

Postby mc2fool » January 22nd, 2017, 2:05 am

odysseus2000 wrote:I was wondering if someone could summarise what has happened with archiving.

Yeah, sure, that's easy. Nothing.

A (very) few folks have archived on the wayback machine a (very) few TMF threads that they've referred and linked to from here, but otherwise nothing has been done.

odysseus2000
Lemon Half
Posts: 6545
Joined: November 8th, 2016, 11:33 pm
Has thanked: 1580 times
Been thanked: 993 times

Re: Archiving TMF boards

#25126

Postby odysseus2000 » January 22nd, 2017, 8:22 am

mc2fool

Yeah, sure, that's easy. Nothing.

A (very) few folks have archived on the wayback machine a (very) few TMF threads that they've referred and linked to from here, but otherwise nothing has been done.


Kind of sad: so much effort & emotion went into creating the message boards at the UK Fool, which now seem destined to vanish from the collective history of the beginning of the Internet age.

Thanks for the update of the unhappy state of play.

Regards,

Gengulphus
Lemon Quarter
Posts: 4255
Joined: November 4th, 2016, 1:17 am
Been thanked: 2631 times

Re: Archiving TMF boards

#25252

Postby Gengulphus » January 22nd, 2017, 8:35 pm

odysseus2000 wrote:
mc2fool wrote:Yeah, sure, that's easy. Nothing.

A (very) few folks have archived on the wayback machine a (very) few TMF threads that they've referred and linked to from here, but otherwise nothing has been done.

Kind of sad: so much effort & emotion went into creating the message boards at the UK Fool, which now seem destined to vanish from the collective history of the beginning of the Internet age.

Well, there is something people can do about it. Rescuing the entire TMF site from that fate by archiving them on the Wayback Machine (or elsewhere) is a mammoth task, one that it has become clear is too big and (at least for some options) too fraught with legal and financial difficulties for anyone to take on. And even rescuing entire boards from it is still a major task and not to be undertaken lightly, at least for the more popular boards.

But anyone who thinks that valuable stuff is disappearing does have the option of picking out stuff they feel is particularly valuable, archiving it on the Wayback Machine, keeping a record of the Wayback Machine links and posting them (or otherwise making them available). I've been doing all but the last stage of that for stuff about my demo HYPs, for example - which means that I've archived some hundreds of threads and have a record of all the links. I haven't actually done the last stage of posting them yet, basically because I haven't yet worked out how best to make such a large amount of material available and usable, and because there's no time pressure on doing that while there is on archiving stuff and recording the links. So the fact that it's been archived isn't particularly visible yet - but it has been.
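Keeping the record of links can be as simple as a spreadsheet-readable file. A minimal Python sketch using the standard csv module; the column names and the function name are assumptions, not the format actually used for the demo HYP material.

```python
import csv
from typing import Iterable, TextIO, Tuple


def write_link_log(rows: Iterable[Tuple[str, str, str]], out: TextIO) -> int:
    """Write (original URL, Wayback Machine URL, note) rows as CSV; return the count."""
    writer = csv.writer(out)
    writer.writerow(["original_url", "wayback_url", "note"])
    count = 0
    for original_url, wayback_url, note in rows:
        writer.writerow([original_url, wayback_url, note])
        count += 1
    return count


if __name__ == "__main__":
    import sys
    write_link_log(
        [(
            "http://boards.fool.co.uk/Message.aspx?mid=5667966",
            "http://web.archive.org/web/20161209000547/http://boards.fool.co.uk/Message.aspx?mid=5667966",
            "example from earlier in this thread",
        )],
        sys.stdout,
    )
```

A plain file like this can later be turned into a post or a web page once the best way of presenting the links has been decided.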

Some hundreds of threads is of course a very small proportion of the threads on the TMF boards, but I'd call it more than "a (very) few TMF threads"! And there may well be others who have done something similar - and there's still time for anyone else who particularly values some TMF board material to do likewise.

The one thing I will say is don't try to organise some grand scheme for rescuing as much as possible, avoiding duplication of effort - that's because we could very easily end up using all or most of the remaining time working out how to organise such a scheme and little or none of it actually doing the archiving! Just get on with archiving whatever you think is most valuable: if enough people do that, a good proportion of the most valuable stuff should get archived. It will still be a small percentage of all the TMF threads - e.g. ten thousand threads is a plausibly achievable amount and is probably under 1% of all the TMF threads. But it should be a considerably higher percentage of the value, on some sort of consensus view of what is valuable.

Gengulphus

odysseus2000
Lemon Half
Posts: 6545
Joined: November 8th, 2016, 11:33 pm
Has thanked: 1580 times
Been thanked: 993 times

Re: Archiving TMF boards

#25257

Postby odysseus2000 » January 22nd, 2017, 9:11 pm

Gengulphus
The one thing I will say is don't try to organise some grand scheme for rescuing as much as possible, avoiding duplication of effort - that's because we could very easily end up using all or most of the remaining time working out how to organise such a scheme and little or none of it actually doing the archiving! Just get on with archiving whatever you think is most valuable: if enough people do that, a good proportion of the most valuable stuff should get archived. It will still be a small percentage of all the TMF threads - e.g. ten thousand threads is a plausibly achievable amount and is probably under 1% of all the TMF threads. But it should be a considerably higher percentage of the value, on some sort of consensus view of what is valuable.


Excellent points.

I will have a go with some of the stuff I find interesting.

I imagine that some future historian will find the posts from the early years of the Internet Age of value. Perhaps not of the standard of Pepys's diaries, but nonetheless a remarkable source documenting how folk thought and acted as this huge change of the Internet Age rolled out.

Regards,

Gengulphus
Lemon Quarter
Posts: 4255
Joined: November 4th, 2016, 1:17 am
Been thanked: 2631 times

Re: Archiving TMF boards

#25262

Postby Gengulphus » January 22nd, 2017, 10:17 pm

Snorvey wrote:Didn't someone say it was something like 8gb of posting data on old Fool?

Can't someone just send them a couple of blank DVD's and ask them to backup the boards to them until we figure out what to do?

I can't say for certain, but Clariman's post of December 8th at the bottom of page 2 of this thread indicates that people have negotiated with TMF about the general issue, and run up against an issue of 5-figure costs. (Mainly licensing, services and insurance costs, by the way - not the costs of DVDs and copying the data on to them.)

Now it may be that the options they explored included TMF just taking a backup and keeping it available until something else can be sorted out, or it may be that they didn't include that. And if they didn't include that, it's possible that if that option were explored, it wouldn't have the same costs - and it's also possible that it would: it depends on what the terms and conditions are for whatever licenses, etc, are involved. Those are the reasons why I can't say for certain that the TMF-takes-a-backup-and-keeps-it-available solution has much greater costs than the obvious ones of a few DVDs and copying the data on to them - only that I suspect it might well do. And if you want a more definitive answer than that, I suspect the best way to get it is to ask the question of TMF!

One other point is that if the TMF-takes-a-backup-and-keeps-it-available solution does have just the obvious costs of a few DVDs and copying the data on to them, I would be absolutely amazed if they weren't taking such a backup for themselves anyway...

Gengulphus

melonfool
Lemon Quarter
Posts: 2939
Joined: November 4th, 2016, 11:18 am
Has thanked: 1365 times
Been thanked: 794 times

Re: Archiving TMF boards

#25264

Postby melonfool » January 22nd, 2017, 10:25 pm

Snorvey wrote:Didn't someone say it was something like 8gb of posting data on old Fool?

Can't someone just send them a couple of blank DVD's and ask them to backup the boards to them until we figure out what to do? *

Just askin' :-)



* I am not an expert in these matters y'understand.


Got to be more than 8gb, surely? I can download that in about ten minutes.

Mel

melonfool
Lemon Quarter
Posts: 2939
Joined: November 4th, 2016, 11:18 am
Has thanked: 1365 times
Been thanked: 794 times

Re: Archiving TMF boards

#25270

Postby melonfool » January 22nd, 2017, 10:48 pm

Snorvey wrote:I'm sure that's what was mentioned somewhere.....and I'm guessing with the Fool's old text only boards (no pics, videos etc) it might just be right enough.


In which case why would it be such a problem?

I downloaded loads (then got bored) and it didn't take long. Took longer trying to work out what to call them etc. But I do have a 50gb download speed which is fairly quick. I suspect when you open them they don't look like they used to though, I've not tried yet.

Mel

Gengulphus
Lemon Quarter
Posts: 4255
Joined: November 4th, 2016, 1:17 am
Been thanked: 2631 times

Re: Archiving TMF boards

#25370

Postby Gengulphus » January 23rd, 2017, 1:04 pm

Snorvey wrote:I'm sure that's what was mentioned somewhere.....and I'm guessing with the Fool's old text only boards (no pics, videos etc) it might just be right enough.

It may have been mentioned, but it was probably somebody's estimate of how much data there might be. Probably an underestimate by a factor of about 5, as the following quote (from the second page of this topic) looks to be based on fact:
stooz wrote:However we have been in discussions, lengthy and time consuming. We are looking at 40gb of data, many hours of conversion work, on going costs, continued legal protection costs and overall a bill over 5 figures... So don't get your hopes up.

Of course, even 40gb isn't by any means prohibitive - and for a backup-and-leave-the-rest-to-later approach, the conversion work should be deferrable until later. How much of the over-5-figure costs can be similarly deferred depends on exactly what they are - so they're the potential sticking point. Anyone who wants to investigate that needs to talk to TMF, stooz and/or someone else who was involved in the discussions stooz mentions.

Gengulphus

