# Setup and Prerequisites Before you start building scrapers, you need to set up your development environment and understand the testing workflow. ## Environment Setup ### 1. Create Environment File Create a `.env` file in the root of the repository with the following variables: ```env MOVIE_WEB_TMDB_API_KEY = "your_tmdb_api_key_here" MOVIE_WEB_PROXY_URL = "https://your-proxy-url.com" # Optional ``` **Getting a TMDB API Key:** 1. Create an account at [TheMovieDB](https://www.themoviedb.org/) 2. Go to Settings > API 3. Request an API key (choose "Developer" for free usage) 4. Use the provided key in your `.env` file **Proxy URL (Optional):** - Useful for testing scrapers that require proxy access - Can help bypass geographical restrictions during development - If not provided, the library will use default proxy services ### 2. Install Dependencies Install all required dependencies: ```sh pnpm install ``` ## Familiarize Yourself with the CLI The library provides a CLI tool that's essential for testing scrapers during development. Unit tests can't be made for scrapers due to their unreliable nature, so the CLI is your primary testing tool. ### Interactive Mode The easiest way to test is using interactive mode: ```sh pnpm cli ``` This will prompt you for: - **Fetcher mode** (native, node-fetch, browser) - **Scraper ID** (source or embed) - **TMDB ID** for the content (for sources) - **Embed URL** (for testing embeds directly) - **Season/episode numbers** (for TV shows) ### Command Line Mode For repeatability and automation, you can specify arguments directly: ```sh # Get help with all available options pnpm cli --help # Test a movie scraper pnpm cli --source-id catflix --tmdb-id 11527 # Test a TV show scraper (Arcane S1E1) pnpm cli --source-id zoechip --tmdb-id 94605 --season 1 --episode 1 # Test an embed scraper directly with a URL pnpm cli --source-id turbovid --url "https://turbovid.eu/embed/DjncbDBEmbLW" ``` ### Common CLI Examples ```sh # Popular test cases pnpm cli --source-id catflix --tmdb-id 11527 # The Shining pnpm cli --source-id embedsu --tmdb-id 129 # Spirited Away pnpm cli --source-id vidsrc --tmdb-id 94605 --season 1 --episode 1 # Arcane S1E1 # Testing different fetcher modes pnpm cli --fetcher native --source-id catflix --tmdb-id 11527 pnpm cli --fetcher browser --source-id catflix --tmdb-id 11527 ``` ### Fetcher Options The CLI supports different fetcher modes: - **`native`**: Uses Node.js built-in fetch (undici) - fastest - **`node-fetch`**: Uses the node-fetch library - **`browser`**: Starts headless Chrome for browser-like environment ::alert{type="warning"} The browser fetcher requires running `pnpm build` first, otherwise you'll get outdated results. :: ### Understanding CLI Output #### Source Scraper Output (Returns Embeds) ```sh pnpm cli --source-id catflix --tmdb-id 11527 ``` Example output: ```json { embeds: [ { embedId: 'turbovid', url: 'https://turbovid.eu/embed/DjncbDBEmbLW' } ] } ``` #### Embed Scraper Output (Returns Streams) ```sh pnpm cli --source-id turbovid --url "https://turbovid.eu/embed/DjncbDBEmbLW" ``` Example output: ```json { stream: [ { type: 'hls', id: 'primary', playlist: 'https://proxy.fifthwit.net/m3u8-proxy?url=https%3A%2F%2Fqueenselti.pro%2Fwrofm%2Fuwu.m3u8&headers=%7B%22referer%22%3A%22https%3A%2F%2Fturbovid.eu%2F%22%2C%22origin%22%3A%22https%3A%2F%2Fturbovid.eu%22%7D', flags: [], captions: [] } ] } ``` **Notice the proxied URL**: The `createM3U8ProxyUrl()` function creates URLs like `https://proxy.fifthwit.net/m3u8-proxy?url=...&headers=...` to handle protected streams. Read more about this in [Advanced Concepts](/in-depth/advanced-concepts). #### Interactive Mode Flow ```sh pnpm cli ``` ``` ✔ Select a fetcher mode · native ✔ Select a source · catflix ✔ TMDB ID · 11527 ✔ Media type · movie ✓ Done! { embeds: [ { embedId: 'turbovid', url: 'https://turbovid.eu/embed/DjncbDBEmbLW' } ] } ``` ## Development Workflow 1. **Setup**: Create `.env` file and install dependencies 2. **Research**: Study the target website's structure and player technology 3. **Code**: Build your scraper following the established patterns 4. **Register**: Add to `all.ts` with unique rank 5. **Test**: Use CLI to test with multiple different movies and TV shows 6. **Iterate**: Fix issues and improve reliability 7. **Submit**: Create pull request with thorough testing documentation ## Next Steps Once your environment is set up: 1. Read [Provider System Overview](/in-depth/provider-system) to understand how scrapers work 2. Learn [Building Scrapers](/in-depth/building-scrapers) for detailed implementation guide 3. Check [Advanced Concepts](/in-depth/advanced-concepts) for error handling and best practices ::alert{type="info"} Always test your scrapers with multiple different movies and TV shows to ensure reliability across different content types. ::