Top 500 Pharma Company

Web-Sourced Intelligence for Pharma Strategy

A targeted approach to competitor tracking using data extraction and AI-driven analysis

data science | artificial intelligence

The Challenge

To maintain a competitive edge in a rapidly evolving pharmaceutical landscape, a Top 500 pharma company sought to gather current, high-value intelligence about its key industry competitors. The goal: extract relevant content from the official websites of 13 major pharma organizations and transform that data into a structured, searchable format. The focus areas included therapeutic modalities, innovation sourcing and acquisitions, data science initiatives, and disease targets.

Given the wide variation in website structures and publication formats, this required not just web scraping, but also precise parsing, tagging, and validation. All content had to be recent (published from 2021 onward), and submissions needed to be reproducible, scalable, and transparent in their data sourcing.
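The recency requirement can be illustrated with a minimal sketch. The record fields (`url`, `published`), the example URLs, and the helper name `is_recent` are hypothetical; the only detail taken from the case study is that content had to be published from 2021 onward, which is assumed here to mean on or after January 1, 2021.

```python
from datetime import date

# Assumption: "published from 2021 onward" means on or after 2021-01-01.
CUTOFF = date(2021, 1, 1)


def is_recent(published_iso: str) -> bool:
    """Return True if an ISO date string (YYYY-MM-DD) is on or after the cutoff."""
    try:
        return date.fromisoformat(published_iso) >= CUTOFF
    except ValueError:
        # Unparseable dates are rejected rather than guessed.
        return False


# Hypothetical scraped records, used only to exercise the filter.
records = [
    {"url": "https://example.com/a", "published": "2020-12-31"},
    {"url": "https://example.com/b", "published": "2021-01-01"},
    {"url": "https://example.com/c", "published": "not a date"},
]

recent = [r for r in records if is_recent(r["published"])]
```

Rejecting records with unparseable dates, instead of keeping them by default, is one way to keep the sourcing transparent: every retained item carries a date that can be checked.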

The Solution

Topcoder launched a focused Innovation Challenge, inviting global data scientists to build a scalable web scraping pipeline. Participants extracted recent content from official pharma websites and structured it into JSONL format, tagging competitor mentions, key topics, and publication metadata. Each submission included Dockerized code, a domain manifest, and reproducible instructions. The top solution provided the customer with a clean, trustworthy dataset ready for search and analysis using OpenSearch.
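As a rough illustration of the tagging and JSONL steps described above, the sketch below turns one scraped page into a tagged record and serializes it as one JSON object per line. It is not the winning submission's code: the competitor names, topic keywords, field names, and helper functions are all hypothetical, and simple keyword matching stands in for whatever tagging method participants actually used.

```python
import json
import re

# Hypothetical vocabularies; the real competitor list and topic taxonomy
# are not given in the case study.
COMPETITORS = ["Acme Pharma", "Example Biotech"]
TOPICS = {
    "therapeutic_modalities": ["gene therapy", "antibody"],
    "data_science": ["machine learning", "data science"],
}


def find_terms(text, terms):
    """Return the terms that occur in text as whole words, ignoring case."""
    return [
        t for t in terms
        if re.search(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE)
    ]


def to_record(url, title, published, text):
    """Build one structured record with competitor and topic tags."""
    return {
        "url": url,
        "title": title,
        "published": published,
        "competitors": find_terms(text, COMPETITORS),
        "topics": [name for name, kws in TOPICS.items() if find_terms(text, kws)],
        "text": text,
    }


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


page = to_record(
    "https://example.com/news/1",
    "Pipeline update",
    "2023-04-02",
    "Acme Pharma expands its gene therapy program using machine learning.",
)
line = to_jsonl([page])
```

Because each line is an independent JSON object, a file in this shape can be appended to incrementally and loaded record by record into a search index.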

Challenge we ran:

Innovation Series IC9: Pharma Competitor Insight Quest - Data extraction from Company Websites

6 Days

57 Participants

21 Submissions


The Impact

The customer gained a ready-to-use data pipeline for monitoring competitor activity directly from official sources. This allowed internal teams to spot emerging trends, track innovation directions, and benchmark against peers in real time.

The solution significantly reduced manual effort and improved the reliability of competitive insights, laying the foundation for scalable intelligence gathering.

Achieve high-quality outcomes with Topcoder.

 
