Full-Stack Web Scraping: Create Link Previews with Vite.js, React, and Node.js

Learn to build a full-stack app using Vite.js, React, and Node.js. This tutorial covers web scraping with Cheerio to create dynamic link previews, combining frontend and backend development seamlessly.

NodeJS

ReactJS

Web Scraping

Fullstack Development

Introduction

Web development is constantly evolving, and with tools like Vite.js and React, creating fast and responsive front-end applications has never been easier. But what happens when you need your app to fetch and display content from other websites? This is where web scraping comes in, and today, we're going to build a full-stack application that does just that.

In this tutorial, you'll learn how to create a dynamic link preview generator using React for the frontend and Node.js with Cheerio for the backend. This is a fantastic project for web developers who want to explore web scraping while working with modern, efficient tools like Vite and TypeScript.

What You'll Learn:

  • Setting up a Vite.js React project with TypeScript
  • Creating a Node.js server with Express
  • Using Axios and Cheerio for web scraping
  • Building a full-stack application in one cohesive project

1. Setting Up Your Project

We'll start by setting up the project structure. In this tutorial, the frontend and backend live side by side under one parent directory: the Vite app in your-project and the Node.js server in server. This keeps development straightforward and the project organized.

Begin by creating the React project with Vite, using the TypeScript template.

Creating the React Frontend with Vite.js

Use Vite to scaffold the React frontend with TypeScript:

pnpm create vite@latest

This command is interactive. When prompted, enter your-project as the project name, then choose React as the framework and TypeScript as the variant. Then navigate to the your-project folder and install dependencies:

cd your-project
pnpm install

2. Setting Up the Node.js Server

Now that the frontend is ready, let's move on to creating a Node.js server. Start by creating a server directory and initializing a Node.js project:

cd ..
mkdir server
cd server
pnpm init

You'll need Express for the server, Axios for making HTTP requests, Cheerio for parsing HTML, body-parser for parsing JSON request bodies, cors for enabling CORS on the API, and yup for validating the request body:

pnpm add express axios cheerio body-parser cors yup

3. Building the Web Scraping API

With the backend set up, we can create an API endpoint that accepts a URL, fetches its content, and extracts key metadata like the title, description, and image.
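
Because this endpoint will request whatever URL a client sends, it may be worth rejecting anything that is not a plain http or https address before fetching it. The sketch below is one way to do that with Node's built-in URL class. isFetchableUrl is a hypothetical helper, not part of this tutorial's code, and its loopback check is deliberately minimal.

```javascript
// Hypothetical helper (an assumption, not from the original tutorial).
// Returns true only for absolute http(s) URLs that do not point at this machine.
function isFetchableUrl(input) {
  let parsed;
  try {
    parsed = new URL(input); // throws on relative or malformed input
  } catch {
    return false;
  }

  // Reject ftp:, file:, data:, and any other non-web scheme.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return false;
  }

  // Minimal loopback check; a real deployment would need a fuller blocklist.
  const blockedHosts = new Set(["localhost", "127.0.0.1", "[::1]"]);
  return !blockedHosts.has(parsed.hostname);
}
```

A check like this could run in the controller right after validation, returning a 422 response when it fails.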

Here's the basic structure of the server in index.js:

// index.js

const express = require("express");
const bodyParser = require("body-parser");
const cors = require("cors");

const { getUrlPreview } = require("./url.controller");

const app = express();
const PORT = process.env.SERVER_PORT || 5005;

app.use(bodyParser.json());
app.use(cors());

app.get("/health", (req, res) => {
  return res.status(200).json({ status: "Server Running" });
});
app.post("/preview", getUrlPreview);

app.listen(PORT, () => {
  console.log("Server is running: %s", PORT);
});

// url.controller.js

const axios = require("axios");
const cheerio = require("cheerio");
const { object, string, ValidationError } = require("yup");

const schema = object({
  url: string().url().required(),
});

const getUrlPreview = async (req, res) => {
  try {
    const value = await schema.validate(req.body);

    const { data } = await axios.get(value.url);
    const $ = cheerio.load(data);

    const title =
      $('meta[property="og:title"]').attr("content") || $("title").text();
    const description =
      $('meta[property="og:description"]').attr("content") ||
      // The standard description tag uses the name attribute, not property.
      $('meta[name="description"]').attr("content");
    const image =
      $('meta[property="og:image"]').attr("content") ||
      $("img").first().attr("src");

    const previewData = {
      title: title || "No title available",
      description: description || "No description available",
      image: image || "No image available",
    };

    return res.status(200).json(previewData);
  } catch (err) {
    if (err instanceof ValidationError) {
      return res.status(422).send(err.message);
    }

    console.error(err);

    return res.status(500).send("Something went wrong!");
  }
};

module.exports = {
  getUrlPreview,
};

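This code sets up a simple Express server that listens for POST requests at /preview. The request body is first validated with yup, and an invalid or missing URL gets a 422 response. For a valid URL, the server fetches the page's HTML with Axios and parses it with Cheerio. It then extracts the title, description, and image, preferring Open Graph tags and falling back to the <title> tag, the description meta tag, and the first <img> on the page, and returns them to the client as JSON.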
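
One detail the controller does not handle: the src of the first <img> (and sometimes the og:image value) can be a relative path such as /images/cover.png, which will not load when the frontend renders it from a different origin. Below is a minimal sketch of one way to deal with this using Node's built-in URL class. toAbsoluteUrl is a hypothetical helper name, not part of the original code.

```javascript
// Hypothetical helper (an assumption, not from the original tutorial).
// Resolves an image reference against the URL of the page it was found on.
function toAbsoluteUrl(src, pageUrl) {
  if (!src) return null; // nothing was scraped

  try {
    // new URL(relative, base) resolves relative and protocol-relative
    // references and leaves absolute URLs unchanged.
    return new URL(src, pageUrl).href;
  } catch {
    return null; // src or pageUrl could not be parsed
  }
}
```

In getUrlPreview, this could be applied as toAbsoluteUrl(image, value.url) before building previewData.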

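4. Building the React Frontend

In the React app, create a component that takes a URL as input and displays the preview fetched from the backend. The code below imports react-hook-form, zod, @hookform/resolvers, and axios, so install those in the frontend project first. It also assumes that the Button, Form, Input, and Card components under ./components/ui and a Loader component under ./components/shared already exist in your project; they are not shown in this tutorial.

Here’s the App component for the link preview generator: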

// App.tsx

import { zodResolver } from "@hookform/resolvers/zod";
import { useState } from "react";
import { useForm } from "react-hook-form";
import { z } from "zod";
import axios from "axios";
import { Loader } from "./components/shared/loader.tsx";
import { Preview } from "./components/shared/preview.tsx";

import { Button } from "./components/ui/button.tsx";
import {
  Form,
  FormControl,
  FormField,
  FormItem,
  FormMessage,
} from "./components/ui/form.tsx";
import { Input } from "./components/ui/input.tsx";

export type DataType = {
  title: string;
  description: string;
  image: string;
};

const schema = z.object({
  url: z.string().url(),
});

function App() {
  const [isLoading, setLoading] = useState<boolean>(false);
  const [data, setData] = useState<DataType | null>(null);

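  const form = useForm<z.infer<typeof schema>>({
    resolver: zodResolver(schema),
    // Start with an empty string so the input is controlled from the first render.
    defaultValues: { url: "" },
  });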

  const onSubmit = async (values: z.infer<typeof schema>) => {
    try {
      setLoading(true);
      setData(null);

      // Named response to avoid shadowing the data state variable.
      const response = await axios.post<DataType>(
        "http://localhost:5005/preview",
        values,
      );
      setData(response.data);
      form.setValue("url", "");
    } catch (err) {
      console.log(err);
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="min-h-screen flex flex-col items-center w-full lg:w-1/2 mx-auto p-10">
      <h1 className="my-10 text-3xl font-bold">Generate Preview for Link</h1>

      <Form {...form}>
        <form
          onSubmit={form.handleSubmit(onSubmit)}
          className="flex w-full gap-2 items-center mb-10"
        >
          <FormField
            render={({ field }) => (
              <FormItem className="flex-1">
                <FormControl>
                  <Input {...field} placeholder="Enter Url" />
                </FormControl>
                <FormMessage />
              </FormItem>
            )}
            name="url"
            control={form.control}
          />
          <Button type="submit">Generate</Button>
        </form>
      </Form>

      {isLoading && <Loader />}

      {data && !isLoading && (
        <Preview
          title={data.title}
          description={data.description}
          image={data.image}
        />
      )}
    </div>
  );
}

export default App;

// components/shared/preview.tsx

import type { DataType } from "../../App.tsx";
import {
  Card,
  CardContent,
  CardDescription,
  CardHeader,
  CardTitle,
} from "../ui/card.tsx";

export const Preview = ({ title, description, image }: DataType) => (
  <Card className="shadow-lg">
    <CardHeader>
      <CardTitle className="leading-8 border-b pb-2 mb-3">{title}</CardTitle>
      <CardDescription>{description}</CardDescription>
    </CardHeader>

    <CardContent>
      <img src={image} alt={title} className="object-cover w-full" />
    </CardContent>
  </Card>
);

This component allows users to enter a URL, which is then sent to the backend to fetch and display the link preview.
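
As written, onSubmit only logs failures to the console, so the user gets no feedback when a URL is rejected or the server is down. Below is a minimal sketch of turning an axios error into a user-facing message. describePreviewError is a hypothetical helper and the message strings are placeholders. It relies on axios setting err.response only when the server actually replied. It is written in plain JavaScript, so it would need type annotations to live in the TypeScript frontend.

```javascript
// Hypothetical helper (an assumption, not from the original tutorial).
// Maps a failed preview request to a message that can be shown in the UI.
function describePreviewError(err) {
  const response = err && err.response;

  // No response means the request never got an answer (network error, server down).
  if (!response) {
    return "Could not reach the preview server.";
  }

  // The controller sends the validation message as plain text with status 422.
  if (response.status === 422) {
    return typeof response.data === "string" && response.data
      ? response.data
      : "That URL is not valid.";
  }

  // Anything else (for example the controller's 500) gets a generic message.
  return "Could not generate a preview for that link.";
}
```

The returned string could be stored in component state inside the catch block and rendered below the form.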

5. Running the Application

Finally, to run the application, you need to start both the frontend and backend servers:

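Start the Node.js server. Note that pnpm init does not add a dev script, so either run the entry file directly as shown here or define your own dev script in server/package.json:

cd server
node index.js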

Start the Vite React frontend in a second terminal:

cd ../your-project
pnpm dev

Navigate to http://localhost:5173, and you'll see your app in action, allowing users to enter a URL and generate a link preview.

Conclusion

In this tutorial, we combined the power of Vite.js, React, Node.js, and Cheerio to create a full-stack application capable of web scraping. Whether you’re looking to create a personal project or add a new skill to your portfolio, understanding how to integrate frontend and backend in a single project is invaluable.

Remember, while web scraping is a powerful tool, it’s essential to use it responsibly. Always respect the terms of service of the websites you scrape, and consider the ethical implications.

If you found this tutorial helpful, don’t forget to subscribe to my channel for more content like this, and drop a comment if you have any questions or suggestions for future tutorials. Happy coding!

