torrentio-scraper-backup/scraper
iPromKnight da997af64a
Move compose to top level
Comment out broken scrapers

Remove db init scripts and use the ENABLE_SYNC env var, which the code base uses to specify whether Sequelize should create or alter tables on startup, meaning we don't need manual db initialization. There were missing tables anyhow :P
2024-01-18 08:54:57 +00:00

Torrentio Scraper

Initial dumps

The Pirate Bay

https://mega.nz/#F!tktzySBS!ndSEaK3Z-Uc3zvycQYxhJA

https://thepiratebay.org/static/dump/csv/

Kickass

https://mega.nz/#F!tktzySBS!ndSEaK3Z-Uc3zvycQYxhJA

https://web.archive.org/web/20150416071329/http://kickass.to/api

RARBG

Scrape the movie and TV catalogs using www.webscraper.io to collect the available imdbIds, then use those ids via the API to search for torrents.

Movies sitemap

{"_id":"rarbg-movies","startUrl":["https://rarbgmirror.org/catalog/movies/[1-4235]"],"selectors":[{"id":"rarbg-movie-imdb-id","type":"SelectorHTML","parentSelectors":["_root"],"selector":".lista-rounded table td[width='110']","multiple":true,"regex":"tt[0-9]+","delay":0}]}

TV sitemap

{"_id":"rarbg-tv","startUrl":["https://rarbgmirror.org/catalog/tv/[1-609]"],"selectors":[{"id":"rarbg-tv-imdb-id","type":"SelectorHTML","parentSelectors":["_root"],"selector":".lista-rounded table td[width='110']","multiple":true,"regex":"tt[0-9]+","delay":0}]}

Migrating Database

When migrating the database to a new one, it is important to restart the files_id_seq sequence at the maximum file id plus 1. Otherwise the sequence in the new database can hand out ids that already belong to migrated rows.

ALTER SEQUENCE files_id_seq RESTART WITH <last_file_id + 1>;
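A minimal sketch of filling in the placeholder above: given the highest migrated file id, it builds the statement to run against the new database. The helper name `buildRestartSequenceSql` is hypothetical, and how the id is obtained is an assumption (for example a `SELECT MAX(id)` on the files table, whose table and column names are guessed from the sequence name). Because the value is interpolated into the statement, the sketch rejects anything that is not a non-negative integer.

```javascript
// Build the ALTER SEQUENCE statement from the highest migrated file id.
// lastFileId is assumed to come from the migrated data (e.g. a MAX(id) query).
function buildRestartSequenceSql(lastFileId) {
  if (!Number.isSafeInteger(lastFileId) || lastFileId < 0) {
    throw new TypeError("lastFileId must be a non-negative integer");
  }
  return `ALTER SEQUENCE files_id_seq RESTART WITH ${lastFileId + 1};`;
}

// Example with a made-up id.
console.log(buildRestartSequenceSql(1500));
```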