 |
How It Works
At DanceScoop, we’ve built an intelligent, end-to-end system that does the heavy lifting so you can focus on dancing. Here’s a quick look at how we deliver complete, tailored event information in just seconds:
|
 |
Data Collection & Extraction
Our system is written in Python and uses a set of libraries (including requests, playwright, pandas, scrapy, streamlit, fuzzywuzzy, regex, and sqlalchemy) to crawl thousands of URLs over a 10-12 hour cycle. AI models from OpenAI and Mistral read and interpret each web page and produce raw JSON output, which is then cleaned and structured before being stored in our Postgres database.
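
To give a feel for the cleaning step, here is a minimal sketch of how one raw JSON record might be validated and tidied before storage. The field names (`event_name`, `start_date`) and the `YYYY-MM-DD` date format are assumptions made for illustration; the actual DanceScoop schema and cleaning rules are not described here.

```python
import json
from datetime import datetime

# Hypothetical required fields; the real schema is not given in the text.
REQUIRED_FIELDS = ("event_name", "start_date")


def clean_event(raw_json):
    """Parse one raw JSON string; return a cleaned dict, or None if unusable."""
    try:
        record = json.loads(raw_json)
    except json.JSONDecodeError:
        return None  # the model returned something that is not valid JSON
    if not isinstance(record, dict):
        return None

    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = " ".join(value.split())  # trim and collapse whitespace
        # Normalise key casing and turn empty strings into None.
        cleaned[key.strip().lower()] = value if value != "" else None

    # Drop records that lack the fields every event needs.
    if any(not cleaned.get(field) for field in REQUIRED_FIELDS):
        return None

    # Assumed date format (YYYY-MM-DD); reject anything that does not parse.
    try:
        parsed = datetime.strptime(cleaned["start_date"], "%Y-%m-%d")
    except (ValueError, TypeError):
        return None
    cleaned["start_date"] = parsed.date().isoformat()
    return cleaned
```

A record that passes these checks could then be written to the database; anything returning `None` would be skipped or logged for review.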
|
 |
Deduplication & Relevance Filtering
Given the vast amount of data collected, duplicates and off-target events are inevitable. We use a blend of Python, regex, and AI-driven techniques to deduplicate records and accurately populate key details (like the dance style). Our system also evaluates each event’s relevance and filters out unrelated entries. For example, it dismissed a “condiment trade event” that had been scraped only because salsa was one of the condiments on offer! Because AI models review each entry, only genuine dance events make the cut.
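
The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production code: it uses the standard library's `difflib` as a stand-in for a fuzzy-matching library, a simple keyword pattern as a stand-in for the AI relevance check, and hypothetical field names (`event_name`, `start_date`, `description`). The 0.85 similarity threshold is likewise an arbitrary choice for the example.

```python
import re
from difflib import SequenceMatcher


def normalise(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", name.lower()).split())


def is_duplicate(a, b, threshold=0.85):
    """Treat two events as duplicates if they share a date and have similar names."""
    if a["start_date"] != b["start_date"]:
        return False
    ratio = SequenceMatcher(
        None, normalise(a["event_name"]), normalise(b["event_name"])
    ).ratio()
    return ratio >= threshold


def deduplicate(events):
    """Keep the first occurrence of each event (pairwise comparison, O(n^2))."""
    unique = []
    for event in events:
        if not any(is_duplicate(event, kept) for kept in unique):
            unique.append(event)
    return unique


# Keyword stand-in for the relevance check: flags food-related "salsa" events.
OFF_TOPIC = re.compile(r"\b(condiment|sauce|recipe)s?\b", re.IGNORECASE)


def looks_off_topic(event):
    text = f"{event['event_name']} {event.get('description', '')}"
    return bool(OFF_TOPIC.search(text))
```

A keyword rule like this only catches obvious cases; ambiguous listings are where a language model's reading of the full page is more useful.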
|
 |
Deployment & Global Access
Once processed, the source files are pushed to our GitHub repository and deployed to Render on AWS. Our architecture includes three dedicated servers – a web server, a backend, and a database server – that work together to serve the DanceScoop database globally, ensuring fast and reliable access.
|
 |
User-Friendly Interface
The final output is a streamlined, Google-like search interface powered by a natural language chatbot. Simply type or speak your query (e.g., “Where can I dance tonight?”), and our system generates the SQL queries needed to retrieve exactly what you are looking for in 5-10 seconds. If you can speak English, no training is required. This minimal interface means you spend less time searching and more time enjoying your dance events.
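
Running model-generated SQL raises an obvious question: how do you make sure a query only reads data? Below is a minimal sketch of one possible guard, assuming the generated SQL arrives as a plain string. It uses an in-memory SQLite database as a stand-in for Postgres, and the table layout, sample rows, and the `generated_sql` string are all made up for the example; the model call itself is not shown.

```python
import re
import sqlite3

# Keywords that should never appear in a read-only query. This check is
# deliberately conservative: it can also reject harmless queries that
# merely mention one of these words inside a string literal.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|attach|pragma)\b",
    re.IGNORECASE,
)


def is_safe_select(sql):
    """Accept only a single SELECT statement with no write keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not statement.lower().startswith("select"):
        return False
    return not FORBIDDEN.search(statement)


def run_generated_query(conn, sql):
    """Run the query only if it passes the read-only check."""
    if not is_safe_select(sql):
        raise ValueError("refusing to run non-SELECT or multi-statement SQL")
    return conn.execute(sql).fetchall()


# Demo with a hypothetical events table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_name TEXT, dance_style TEXT, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("Salsa Night", "salsa", "2025-03-01"),
        ("Bachata Social", "bachata", "2025-03-01"),
        ("Tango Milonga", "tango", "2025-03-02"),
    ],
)

# Pretend this string came back from the language model.
generated_sql = (
    "SELECT event_name FROM events "
    "WHERE start_date = '2025-03-01' ORDER BY event_name;"
)
rows = run_generated_query(conn, generated_sql)
```

In a real deployment, a read-only database role would be a stronger safeguard than string checks alone; the guard above is just a first line of defence.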
|
 |
In a Nutshell
DanceScoop automates the complex process of web scraping, data cleaning, and filtering, so you get the full picture of the local dance scene in seconds, without the hassle of visiting multiple websites. All you need to do is ask!
|