
Torrentio Scraper

Initial dumps

The Pirate Bay

https://mega.nz/#F!tktzySBS!ndSEaK3Z-Uc3zvycQYxhJA

https://thepiratebay.org/static/dump/csv/

Kickass

https://mega.nz/#F!tktzySBS!ndSEaK3Z-Uc3zvycQYxhJA

https://web.archive.org/web/20150416071329/http://kickass.to/api

RARBG

Scrape the movie and TV catalogs with www.webscraper.io to collect the available IMDb ids, then use those ids with the API to search for torrents.

Movies sitemap

{"_id":"rarbg-movies","startUrl":["https://rarbgmirror.org/catalog/movies/[1-4235]"],"selectors":[{"id":"rarbg-movie-imdb-id","type":"SelectorHTML","parentSelectors":["_root"],"selector":".lista-rounded table td[width='110']","multiple":true,"regex":"tt[0-9]+","delay":0}]}

TV sitemap

{"_id":"rarbg-tv","startUrl":["https://rarbgmirror.org/catalog/tv/[1-609]"],"selectors":[{"id":"rarbg-tv-imdb-id","type":"SelectorHTML","parentSelectors":["_root"],"selector":".lista-rounded table td[width='110']","multiple":true,"regex":"tt[0-9]+","delay":0}]}

Migrating Database

When migrating the database to a new one, it is important to restart the files_id_seq sequence at the maximum file id plus 1, so that newly inserted files do not reuse ids that already exist.

ALTER SEQUENCE files_id_seq RESTART WITH <last_file_id + 1>;
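If the migration is scripted, the statement above can be generated from the highest file id found in the migrated data. The sketch below is only an illustration: `buildRestartSequenceSql` is a hypothetical helper, and how the maximum id is obtained (for example with something like `SELECT MAX(id) FROM files`, where the table and column names are guesses based on the sequence name) is an assumption.

```javascript
// Hypothetical helper: build the ALTER SEQUENCE statement for a given
// maximum file id. The default sequence name comes from the README;
// everything else here is illustrative.
function buildRestartSequenceSql(maxFileId, sequenceName = 'files_id_seq') {
  if (!Number.isSafeInteger(maxFileId) || maxFileId < 0) {
    throw new RangeError('maxFileId must be a non-negative integer');
  }
  // The next generated id must be one past the largest existing id.
  return `ALTER SEQUENCE ${sequenceName} RESTART WITH ${maxFileId + 1};`;
}
```

The returned string can then be run against the new database once the data has been imported.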