Ecosystem|October 27, 2024

Exploring Real-Time Website Monitoring with AWS Lambda & DynamoDB: A Learning Journey

#AWS#Serverless#Observability#Database

Websites are a primary channel for communicating with customers, partners, and employees, so organizations need to know when the information on them changes or goes out of date. As a way to gain hands-on experience with AWS services, this article explores using AWS Lambda and DynamoDB for website monitoring and notification.

AWS Lambda is a serverless compute service that runs code without requiring you to provision or manage the underlying infrastructure. It runs code in response to events, such as changes to DynamoDB data, uploads to an S3 bucket, or HTTP requests from Amazon API Gateway. DynamoDB is a fully managed NoSQL database service designed for single-digit millisecond latency at any scale.

With AWS Lambda and DynamoDB, a monitoring and notification system can alert stakeholders whenever the website’s information changes. The system scrapes the website on a schedule and compares the result to the previous scrape. If anything differs, it sends a notification to the relevant stakeholders.
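If the goal is only to detect that something changed, the compare step can work on a fingerprint of the scrape rather than on the full data. The following is a minimal sketch of that variant, not the approach used in the article's function. `fingerprint` and `hasChanged` are hypothetical helper names, and the sample rows are made up for illustration; the only assumption is that each scraped row has `date`, `title`, and `description` fields.

```javascript
const crypto = require('crypto');

// Hypothetical helper: reduce the scraped rows to a SHA-256 hex digest.
// Each row is serialized as an ordered array of fields, so object key order cannot affect the hash.
function fingerprint(rows) {
    const canonical = JSON.stringify(
        rows.map((row) => [row.date, row.title, row.description])
    );
    return crypto.createHash('sha256').update(canonical).digest('hex');
}

// Hypothetical helper: true when the current scrape differs from the stored fingerprint.
function hasChanged(previousFingerprint, currentRows) {
    return previousFingerprint !== fingerprint(currentRows);
}

// Made-up sample data for illustration only.
const monday = [
    { date: '2024-10-21', title: 'Maintenance', description: 'Planned downtime' }
];
const tuesday = [
    ...monday,
    { date: '2024-10-22', title: 'Release', description: 'Version 2 is live' }
];

const stored = fingerprint(monday);
console.log(hasChanged(stored, monday));  // false
console.log(hasChanged(stored, tuesday)); // true
```

The trade-off is that a fingerprint says that something changed but not what changed. In exchange, the stored item stays small no matter how large the page is, which matters because a single DynamoDB item cannot exceed 400 KB.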

Here is sample code for an AWS Lambda function that scrapes the website and compares the data:

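const axios = require('axios');  
const cheerio = require('cheerio');  
// AWS SDK for JavaScript v2; Node.js 18+ Lambda runtimes ship only v3, so bundle this package when deploying there.  
const AWS = require('aws-sdk');  
exports.handler = async (event) => {  
    const url = 'https://www.example.com';  
    const dynamoDb = new AWS.DynamoDB.DocumentClient();  
    // Fetch the page and parse its HTML.  
    const res = await axios.get(url);  
    const $ = cheerio.load(res.data);  
    // Build one record per table row, skipping rows without <td> cells (such as header rows).  
    const currentData = [];  
    $('table tr').each((i, elem) => {  
        const cells = $(elem).find('td');  
        if (cells.length === 0) return;  
        currentData.push({  
            date: cells.eq(0).text().trim(),  
            title: cells.eq(1).text().trim(),  
            description: cells.eq(2).text().trim()  
        });  
    });  
    // Load the snapshot saved by the previous run, if there is one.  
    const previousData = await dynamoDb.get({  
        TableName: 'website_data',  
        Key: { id: 'previous_data' }  
    }).promise();  
    if (previousData.Item) {  
        const changes = compareData(previousData.Item.data, currentData);  
        if (changes.length) {  
            // Notify first, then store the current scrape as the new baseline.  
            await sendNotification(changes);  
            await dynamoDb.put({  
                TableName: 'website_data',  
                Item: { id: 'previous_data', data: currentData }  
            }).promise();  
        }  
    } else {  
        // First run: nothing to compare against, so only save the baseline.  
        await dynamoDb.put({  
            TableName: 'website_data',  
            Item: { id: 'previous_data', data: currentData }  
        }).promise();  
    }  
};  
// Returns the rows added or removed between two scrapes; an edited row appears once as removed and once as added.  
function compareData(previous, current) {  
    const key = (row) => JSON.stringify([row.date, row.title, row.description]);  
    const before = new Set(previous.map(key));  
    const after = new Set(current.map(key));  
    const added = current.filter((row) => !before.has(key(row)));  
    const removed = previous.filter((row) => !after.has(key(row)));  
    return [  
        ...added.map((row) => ({ change: 'added', ...row })),  
        ...removed.map((row) => ({ change: 'removed', ...row }))  
    ];  
}  
// Placeholder: only logs the changes; swap in a real delivery channel such as email or a messaging service.  
async function sendNotification(changes) {  
    console.log(`Detected ${changes.length} change(s):`, JSON.stringify(changes));  
}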

The above code demonstrates how AWS Lambda can be used to scrape a website, compare the data with the previous scrape, and send a notification if anything has changed. The function uses the Axios library to make a GET request to the website and the Cheerio library to extract the table rows from the HTML response. It then loads the previous scrape from a DynamoDB table and compares the two. If they differ, the function notifies the relevant stakeholders through the sendNotification function and saves the current data as the new baseline. On the first run, when no previous data exists, it only saves the baseline.
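The article does not say which channel sendNotification uses. One possible way to fill it in is to publish to an Amazon SNS topic; the sketch below is an assumption along those lines, not the author's implementation. `buildNotification` and `createNotifier` are hypothetical helpers, the topic ARN is a placeholder, and each change is assumed to carry `date` and `title` fields (plus an optional `change` label). The publish call is injected so the formatting logic can run without AWS credentials.

```javascript
// Hypothetical helper: turn a list of changed rows into SNS publish parameters.
function buildNotification(changes, topicArn) {
    const lines = changes.map(
        (c) => `- ${c.change || 'changed'} [${c.date}] ${c.title}`
    );
    return {
        TopicArn: topicArn,
        // SNS requires subjects shorter than 100 characters, so truncate defensively.
        Subject: `Website changed: ${changes.length} row(s)`.slice(0, 99),
        Message: lines.join('\n')
    };
}

// Hypothetical factory: returns a sendNotification(changes) function bound to a publisher.
// In Lambda, `publish` could be: (params) => new AWS.SNS().publish(params).promise()
function createNotifier(publish, topicArn) {
    return async function sendNotification(changes) {
        return publish(buildNotification(changes, topicArn));
    };
}

// Illustration with made-up data and a placeholder ARN.
const example = buildNotification(
    [{ change: 'added', date: '2024-10-22', title: 'Release' }],
    'arn:aws:sns:us-east-1:123456789012:website-changes'
);
console.log(example.Subject); // Website changed: 1 row(s)
console.log(example.Message); // - added [2024-10-22] Release
```

Keeping the message formatting separate from the publish call makes it easy to test locally and to swap SNS for another channel later.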

To run the Lambda function at regular intervals, Amazon EventBridge can be used. EventBridge is a serverless event bus that routes events between AWS services, SaaS applications, and custom applications. To schedule the function, create a scheduled EventBridge rule (EventBridge is the successor to CloudWatch Events) that triggers it at a fixed rate or on a cron schedule. The shortest interval such a rule supports is one minute, so the monitoring is near-real-time rather than instantaneous.
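As a rough sketch of what such a schedule looks like in code, the helpers below build the parameters for the EventBridge `putRule` and `putTargets` calls. `rateExpression` and `buildSchedule` are hypothetical names, and the rule name, target ID, and function ARN are placeholders chosen for illustration.

```javascript
// Hypothetical helper: build a rate expression for a scheduled rule.
// The unit must be singular when the value is 1: rate(1 minute), but rate(5 minutes).
function rateExpression(value, unit) {
    if (!Number.isInteger(value) || value < 1) {
        throw new RangeError('value must be a positive integer');
    }
    if (!['minute', 'hour', 'day'].includes(unit)) {
        throw new RangeError('unit must be "minute", "hour", or "day"');
    }
    return `rate(${value} ${unit}${value === 1 ? '' : 's'})`;
}

// Hypothetical helper: parameters for EventBridge putRule and putTargets.
function buildSchedule(ruleName, functionArn, value, unit) {
    return {
        rule: {
            Name: ruleName,
            ScheduleExpression: rateExpression(value, unit),
            State: 'ENABLED'
        },
        targets: {
            Rule: ruleName,
            Targets: [{ Id: 'website-monitor', Arn: functionArn }]
        }
    };
}

// Placeholder names and ARN for illustration.
const schedule = buildSchedule(
    'scrape-every-5-minutes',
    'arn:aws:lambda:us-east-1:123456789012:function:website-monitor',
    5,
    'minute'
);
console.log(schedule.rule.ScheduleExpression); // rate(5 minutes)
```

With the AWS SDK for JavaScript v2, these objects could be passed to `new AWS.EventBridge().putRule(schedule.rule).promise()` followed by `putTargets(schedule.targets).promise()`. The function also needs a resource-based permission that allows `events.amazonaws.com` to invoke it, which can be granted with the Lambda `addPermission` call.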

With this monitoring and notification system in place, organizations can stay on top of changes to their website and act promptly when needed. Because Lambda bills per request and execution time and DynamoDB offers on-demand capacity, a small scheduled monitor like this is inexpensive to run and scales without infrastructure changes.

In conclusion, AWS Lambda and DynamoDB are well suited to building a website monitoring and notification system. This hands-on exercise shows how the two services fit together, and the article can serve as a starting point for those looking to gain practical experience with AWS services while building a useful application.