Generative AI example using the BART model - a Python project

This Python program demonstrates the use of the Hugging Face Transformers library to perform text generation using a pre-trained BART (Bidirectional and Auto-Regressive Transformers) model. The program begins by importing the necessary components, such as `BartForConditionalGeneration` for the model and `BartTokenizer` for text tokenization. The primary goal is to generate text based on a given prompt, making it a versatile tool for tasks like text completion, summarization, and creative writing.

The program loads a pre-trained BART model, "facebook/bart-large-cnn" (a BART variant fine-tuned on the CNN/Daily Mail summarization dataset), and its corresponding tokenizer. The model can generate coherent, contextually meaningful text. The code defines a Python function, `generate_text`, which serves as the core of the text generation process. It takes a user-provided prompt as input and generates text based on this prompt.

Within the `generate_text` function, the prompt is first tokenized using the BART tokenizer. Tokenization is a crucial step that converts text into a format the model can understand. Once the prompt is tokenized, the model generates text that extends it. The maximum length of the generated text can be controlled, and the function can return multiple sequences if needed.
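As a small illustration of what the tokenization step produces, separate from the full program below (a minimal sketch; the exact token IDs depend on the tokenizer's vocabulary):

from transformers import BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")

# encode() turns the prompt into a tensor of token IDs, adding BOS (0) and EOS (2) markers
ids = tokenizer.encode("Google was created by", return_tensors="pt")
print(ids.shape)  # torch.Size([1, number_of_tokens])

# decode() maps the IDs back to text, confirming the round trip
print(tokenizer.decode(ids[0], skip_special_tokens=True))  # Google was created by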

The key parameters used to control generation in this program are `no_repeat_ngram_size` and `top_k`, which shape the quality and diversity of the output. `no_repeat_ngram_size` prevents any n-gram of the given size (here, any two-token sequence) from appearing more than once, which reduces repetitive phrasing, while `top_k` limits each decoding step to the k most probable tokens in the vocabulary.
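One caveat: in the Transformers library, `top_k` only influences generation when sampling is enabled with `do_sample=True`; under the default greedy or beam search it is ignored. A sampling variant of the generate call from the program below could look like this (a minimal sketch; the parameter values are illustrative):

# Sampling variant: top_k is only applied when do_sample=True
generated_ids = model.generate(
    input_ids,
    max_length=300,
    do_sample=True,          # sample from the distribution instead of beam search
    top_k=50,                # keep only the 50 most probable tokens at each step
    no_repeat_ngram_size=2,  # never repeat the same 2-gram in the output
)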

The program is structured to allow users to experiment with various prompts and control parameters to obtain different outputs. In this particular example, the prompt "Google was created by" is provided, and the generated text is printed to the console.
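For example, using the `generate_text` function from the listing below, a different prompt and length limit could be tried like this (the prompt and values are purely illustrative):

# Hypothetical experiment: another prompt and a shorter output limit
outputs = generate_text("The history of the Internet", max_length=150)
for text in outputs:
    print(text)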

By exploring and running this program, users can gain insights into advanced natural language processing techniques, including the use of powerful pre-trained models for text generation. It offers a hands-on experience for understanding how large language models can be leveraged for creative and practical purposes, making it a valuable tool for both learning and real-world applications in natural language processing.


from transformers import BartForConditionalGeneration, BartTokenizer

# Load the BART model and tokenizer
model_name = "facebook/bart-large-cnn"
model = BartForConditionalGeneration.from_pretrained(model_name)
tokenizer = BartTokenizer.from_pretrained(model_name)

def generate_text(prompt, max_length=300, num_return_sequences=1):
    # Tokenize the prompt into input IDs the model can process
    input_ids = tokenizer.encode(prompt, return_tensors="pt", max_length=1024, truncation=True)
    # Generate token IDs; no_repeat_ngram_size=2 blocks repeated 2-grams in the output
    generated_ids = model.generate(input_ids, max_length=max_length,
                                   num_return_sequences=num_return_sequences,
                                   no_repeat_ngram_size=2, top_k=50)
    # Decode each generated sequence back into readable text
    generated_texts = [tokenizer.decode(g, skip_special_tokens=True) for g in generated_ids]
    return generated_texts
    
prompt = "Google was created by "

# Generate text
generated_text = generate_text(prompt, max_length=500)

for i, text in enumerate(generated_text):
    print(f"Generated Text {i + 1}:")
    print(text)
    print()

The output of the program will look something like this (the exact text may vary with the library version): "Google was created by Google founder Sergey Brin. The search giant was founded in 1998 in Mountain View, California. Google is one of the most popular search engines in the world, with more than 100 million searches a day. Click through the gallery to see some of Google's most memorable moments."
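Because "facebook/bart-large-cnn" was fine-tuned for summarization, the same model and tokenizer can also condense a longer passage. A minimal sketch, with a placeholder article text:

# Summarization with the same model: pass a longer document as the input
article = "Replace this with a long news article or other document to summarize..."
input_ids = tokenizer.encode(article, return_tensors="pt", max_length=1024, truncation=True)
summary_ids = model.generate(input_ids, max_length=130, min_length=30, num_beams=4, no_repeat_ngram_size=2)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))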
