Copyright-safe TMF post extractor

Posted: November 5th, 2016, 11:03 am
by Meatyfool
How's this for an idea?

Scraping technology is coded to extract posts from the TMF page that lists a user's posts in time order, NOT from a board list.

Bear with me.

This ensures that the user (or a willing volunteer) can extract their posts in a copyright-safe manner. TMF may permit said scraper to operate on their servers, as they will have the tech at hand to prevent scraper bots from operating - they are obliged to protect the copyright?

I would expect that each extracted post would be given a unique primary key by the software. This, along with

Board name
User name
Post date and time
Etc

could be uploaded into a spreadsheet as an aid to rebuilding a thread.
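
To make that concrete, here is a rough Python sketch of what one extracted record and the spreadsheet export might look like. The field names, the `ExtractedPost` class and the `build_posts` / `to_csv` helpers are all my own assumptions for illustration; nothing here reflects how TMF actually lays out its pages, and the scraping step itself is left out.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class ExtractedPost:
    # Hypothetical record layout; the field names are assumptions.
    post_id: int      # unique primary key assigned by the software
    board_name: str
    user_name: str
    posted_at: str    # post date and time, kept as text as scraped
    body: str


def build_posts(rows):
    """Assign a sequential primary key (starting at 1) to each scraped row."""
    return [ExtractedPost(post_id=i, **row) for i, row in enumerate(rows, start=1)]


def to_csv(posts):
    """Return the posts as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(ExtractedPost)])
    writer.writeheader()
    for post in posts:
        writer.writerow(asdict(post))
    return buf.getvalue()
```

Sorting the resulting sheet by board name and then by date and time would be one way to start piecing a thread back together.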

The resulting data extract is then NOT loaded into Lemonfool straight away.

Posts will only be added one board at a time.

So when enough users have contributed their posts to enable the "meat" of the board to be visible, the upload begins.
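
How much is "enough" would need deciding. As a rough sketch only, the check could be as simple as counting distinct contributors per board. The `boards_ready` function and the `min_users` threshold of 3 are made up for illustration, and the posts are assumed to be plain dicts with `board_name` and `user_name` keys.

```python
from collections import defaultdict


def boards_ready(posts, min_users=3):
    """Return the boards where at least `min_users` distinct users have contributed.

    `min_users` is an arbitrary illustrative threshold, not an agreed figure.
    """
    users_by_board = defaultdict(set)
    for post in posts:
        users_by_board[post["board_name"]].add(post["user_name"])
    return sorted(
        board for board, users in users_by_board.items() if len(users) >= min_users
    )
```

A fancier version might weigh the number of posts rather than the number of users, but that needs a known total per board to compare against.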


Additional proposal.

If we extracted board post lists too (which would be under TMF copyright), we could fill in blanks in a thread with a post purporting to be from that user with the text "This TMF post was not retrieved".
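
A minimal sketch of that gap-filling, assuming the board listing can be reduced to an ordered list of (post number, user name) pairs and the retrieved posts to a dict keyed by post number. Both shapes, and the `rebuild_thread` name, are my assumptions rather than anything TMF provides.

```python
PLACEHOLDER_TEXT = "This TMF post was not retrieved"


def rebuild_thread(board_listing, retrieved):
    """Walk the listing in order, using a placeholder wherever a post is missing.

    board_listing: ordered list of (post_number, user_name) tuples (assumed shape)
    retrieved:     dict mapping post_number -> post body for contributed posts
    """
    thread = []
    for post_number, user_name in board_listing:
        body = retrieved.get(post_number)
        thread.append(
            {
                "post_number": post_number,
                "user_name": user_name,
                "body": body if body is not None else PLACEHOLDER_TEXT,
                "retrieved": body is not None,
            }
        )
    return thread
```

Keeping a `retrieved` flag on each entry means a placeholder can be swapped for the real post later if the user turns up and contributes it.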


Lastly, and not hopefully,

We could request permission from TMF to extract all posts, under legal terms stating that the posts would not be published until such time as the user approaches us and gives permission.

Identifying whether they are the true user is difficult, as anyone could go back to the Internet Archive and gather information from those posts to fool anyone checking identity.

And there is no way TMF are going to give us permission to have their customer database and code for emailing back password reminders. What if we agreed to buy it?

Just a few ideas, from possible to dumb!

Meatyfool