The Baltimore Banner: AWS-based data and AI solutions driving subscription growth
The Baltimore Banner is a Pulitzer Prize-winning news platform with over 60,000 paying subscribers, whose subscriptions account for almost half of its revenue. With subscriptions integral to its growth strategy, the company engaged AgileEngine to develop a performant, GenAI-ready data infrastructure for content and audience analysis. Leveraging AWS, we’ve built a strong foundation for our client’s data-driven growth, bringing speed and cost efficiency to critical analytics workflows.
Industries
Digital media, Entertainment, Subscription
Services
Data engineering, AI engineering, DevOps
Solutions
Data pipeline, Data lake, Cloud, CI/CD, AI, Data visualization, Business intelligence
Technologies
AWS, Bedrock, Step Functions, Lambda, S3, Glue, Athena, Anthropic Claude Sonnet 3.7, BigQuery, Apache Airflow, Arc XP, Power BI, Amazon Titan, Mistral AI
AI tools that powered our workflow
Copilot, Cursor, ChatGPT
Outcomes and highlights
- 3 weeks from project start to release for the AI-driven story classification system
- 70% faster AI-driven content analysis compared to human performance
- 99% cost savings on classification tasks compared to human-driven workflows
- 93% accuracy achieved for content classification thanks to AI
- 99% of business intelligence reports migrated to an automated solution
Solutions overview
AWS-based solutions enabling granular, AI-powered content analysis
Our team introduced an AWS-based data lake and optimized the company’s ETL and ELT pipelines, creating a modern cloud-native ecosystem for content analytics. Using this ecosystem, we modernized the client’s reporting capabilities and developed an AI-driven content categorization tool based on AWS Bedrock and Claude Sonnet 3.7. Developed in weeks, this AI tool uses AWS Step Functions, Lambda, S3, Glue, and Athena.
Key deliverables
- Data lake architecture based on AWS Lake Formation
- GenAI-driven content classification and analysis system
- Self-service solution enabling the analytics team and other stakeholders to create custom views, models, and dashboards with the data from the data lake
- Optimization of ETL/ELT pipelines for cost savings and maintainability
- Infrastructure setup for the orchestration and scheduling of data pipelines
- Consolidation of KPIs from multiple APIs and data warehouses for app usage reporting
- Migration of 99% of manual reports to a Power BI solution with custom workspaces, datasets, and dashboards, as well as a centralized hub with metrics and KPIs
- Disaster recovery plan and safe vault solution for the Arc XP content management environment
- CI/CD pipeline for moving ETL jobs and Airflow DAGs from GitHub to AWS using S3
Technologies
AWS, Bedrock, Step Functions, Lambda, S3, Glue, Athena, Anthropic Claude Sonnet 3.7, BigQuery, Apache Airflow, Arc XP, Power BI
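To illustrate the CI/CD deliverable listed above, here is a minimal sketch of how a pipeline step might map repository files to S3 keys and upload them. The folder names, S3 prefixes, and bucket name are hypothetical, since the case study does not describe the actual repository layout. The only AWS call used is the boto3 S3 client's upload_file.

```python
from pathlib import Path, PurePosixPath

# Hypothetical repository folders and S3 prefixes; the real layout is not documented here.
PREFIXES = {"dags": "airflow/dags", "etl_jobs": "glue/scripts"}


def plan_uploads(repo_root):
    """Return (local_path, s3_key) pairs for every Python file under the tracked folders."""
    root = Path(repo_root)
    plan = []
    for folder, prefix in PREFIXES.items():
        source_dir = root / folder
        if not source_dir.is_dir():
            continue  # skip folders that are absent from this checkout
        for path in sorted(source_dir.rglob("*.py")):
            relative = path.relative_to(source_dir)
            plan.append((path, str(PurePosixPath(prefix) / relative.as_posix())))
    return plan


def deploy(repo_root, bucket, s3_client):
    """Upload each planned file; s3_client is a boto3 S3 client created by the caller."""
    for path, key in plan_uploads(repo_root):
        s3_client.upload_file(str(path), bucket, key)
```

In a GitHub-hosted workflow, a step like this would typically run after tests pass, with deploy called as deploy(".", "example-bucket", boto3.client("s3")). Separating the upload plan from the upload itself keeps the mapping testable without AWS credentials.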
Power BI for reporting
Our experts also modernized the client’s business intelligence capabilities. AgileEngine migrated the company’s manual BI reports to a self-service solution, complete with feature-rich workspaces, custom datasets, and a centralized hub featuring key dashboards and KPIs.
Key deliverables
- Power BI analytics ecosystem with organized workspaces and datasets for enterprise-wide access
- Self-service analytics platform enabling stakeholders to create custom data views and models
- Migration of 99% of manual reports to a Power BI solution
- Centralized dashboard hub featuring the flagship executive reporting suite
- Specialized deep-dive dashboards for retention analysis with cohort segmentation capabilities
- Automated dashboard distribution system accessible by stakeholders without Power BI licenses
- Expanded KPI coverage and full automation for one of the client’s main dashboards
- Pre-calculated metrics datasets speeding up dashboard creation and analysis
Technologies
Power BI, AWS
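As an illustration of what a pre-calculated retention dataset could contain, the sketch below computes a cohort retention table from subscription start and end dates. It is an assumption-based example in plain Python, not the client's actual metric definition: the input shape, the monthly cohort grouping, and the function names are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def months_between(start, end):
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def cohort_retention(subscriptions, max_months):
    """Map each signup month (YYYY-MM) to retention shares for month offsets 0..max_months.

    subscriptions: iterable of (start_date, end_date); end_date is None while active.
    Simplification: active subscribers count as retained at every offset,
    including offsets that have not been observed yet.
    """
    cohorts = defaultdict(list)
    for start, end in subscriptions:
        cohorts[start.strftime("%Y-%m")].append((start, end))
    table = {}
    for month, members in sorted(cohorts.items()):
        row = []
        for offset in range(max_months + 1):
            retained = sum(
                1 for start, end in members
                if end is None or months_between(start, end) >= offset
            )
            row.append(retained / len(members))
        table[month] = row
    return table


subs = [
    (date(2024, 1, 5), None),
    (date(2024, 1, 20), date(2024, 2, 10)),
    (date(2024, 2, 1), date(2024, 2, 15)),
]
print(cohort_retention(subs, 2))
# {'2024-01': [1.0, 1.0, 0.5], '2024-02': [1.0, 0.0, 0.0]}
```

A table like this, refreshed on a schedule and stored alongside other metrics, is the kind of dataset a dashboard can read directly instead of recomputing retention on every load.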
AI-driven content categorization system
Our custom AI solution pushed The Baltimore Banner’s content analytics beyond the capabilities of the company’s legacy CMS. AgileEngine delivered the solution in three weeks, using AWS Step Functions and Lambda to call a Bedrock flow for document classification, and S3, Glue, and Athena to analyze the results.
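The Lambda-to-Bedrock step described above can be sketched roughly as follows. This is a minimal illustration, not the production code: the environment variable names, the event fields (story_id, story_text), and the assumption that the flow uses the default input node names are all hypothetical. The invoke_flow call belongs to the boto3 bedrock-agent-runtime client.

```python
import os


def classify_story(client, flow_id, flow_alias_id, story_text):
    """Send one story through a Bedrock flow and return the flow's output document."""
    response = client.invoke_flow(
        flowIdentifier=flow_id,
        flowAliasIdentifier=flow_alias_id,
        inputs=[
            {
                # Assumed default input node names; adjust if the flow defines others.
                "nodeName": "FlowInputNode",
                "nodeOutputName": "document",
                "content": {"document": story_text},
            }
        ],
    )
    result = None
    # invoke_flow returns a stream of events; keep the last output and check completion.
    for event in response["responseStream"]:
        if "flowOutputEvent" in event:
            result = event["flowOutputEvent"]["content"]["document"]
        elif "flowCompletionEvent" in event:
            reason = event["flowCompletionEvent"]["completionReason"]
            if reason != "SUCCESS":
                raise RuntimeError(f"Flow ended with reason: {reason}")
    return result


def handler(event, context, client=None):
    """Lambda entry point, intended to run as a Step Functions task."""
    if client is None:
        import boto3  # imported lazily so the module can be tested without AWS access
        client = boto3.client("bedrock-agent-runtime")
    category = classify_story(
        client,
        os.environ["FLOW_ID"],        # hypothetical environment variable
        os.environ["FLOW_ALIAS_ID"],  # hypothetical environment variable
        event["story_text"],          # hypothetical event field
    )
    return {"story_id": event["story_id"], "category": category}
```

In a setup like this, a Step Functions Map state could fan stories out to the handler, and each returned record could be written to S3 so that Glue and Athena can query the classifications later.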
Key deliverables
- GenAI-driven content classification and analysis system
- Automated content tagging system replacing limited CMS taxonomy with intelligent categorization
- Data lake infrastructure optimized for AI/ML workloads and model training
- ETL/ELT pipelines engineered for AI data preprocessing and feature engineering workflows
- Parallel evaluation and testing of six LLMs to identify the most accurate option
- Eight iterations on GenAI prompting, refining definitions from basic two-sentence descriptions to comprehensive explanations
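The model comparison described in the deliverables above can be pictured as scoring each candidate against a set of human-labeled stories and ranking the results. The sketch below is a simplified illustration under that assumption: the model names, the labeled examples, and the predict functions are toy stand-ins, and a real predict function would call the corresponding LLM with the current prompt version.

```python
from concurrent.futures import ThreadPoolExecutor


def accuracy(predict, labeled_stories):
    """Share of stories where the predicted category matches the human label."""
    correct = sum(1 for text, label in labeled_stories if predict(text) == label)
    return correct / len(labeled_stories)


def rank_models(models, labeled_stories):
    """Score every candidate concurrently and return (name, accuracy) pairs, best first."""
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {
            name: pool.submit(accuracy, predict, labeled_stories)
            for name, predict in models.items()
        }
        scores = {name: future.result() for name, future in futures.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy stand-ins for real model calls (hypothetical names and labels).
labeled = [("Ravens win opener", "Sports"), ("Council passes budget", "Politics")]
models = {
    "model_a": lambda text: "Sports" if "Ravens" in text else "Politics",
    "model_b": lambda text: "Sports",
}
print(rank_models(models, labeled))
# [('model_a', 1.0), ('model_b', 0.5)]
```

Rerunning the same harness after each prompt revision gives a consistent way to see whether a more detailed category definition actually improves accuracy.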
Technologies
AWS, Bedrock, Step Functions, Lambda, S3, Glue, Athena, Anthropic Claude Sonnet 3.7, Amazon Titan, Mistral AI