Muhammad Raafey Tariq · imagine-talent

Email

—

Phone

—

GitHub

—

Academic

Program

—

CGPA

—

Year

2020

Education

—

Address

—

DOB

—

Career

Current role

—

Target role

—

Skills

Python, TensorFlow, Gensim, Google Colab, Google Cloud, Word2Vec, GloVe, fastText, ELMo, BERT, Natural Language Processing

Verbatim text

The exact text the LLM saw on the page (or the booklet text from the old import). This is what powers semantic search.

Comparative Analysis of Different Word Embedding Techniques

In this project, we trained different word-embedding models using the Word2Vec, GloVe, fastText,
ELMo and BERT techniques on Urdu and Roman Urdu corpora. Each model was trained using the
same set of parameters. The embeddings learnt by these models were evaluated using a number of
Natural Language Processing tasks. Our comparison consisted of both a Qualitative Analysis and
a Quantitative Analysis.
For the Qualitative Analysis, we plotted clusters of words that these models regarded as similar
and drew conclusions based on characteristics of those graphs. For the Quantitative Analysis, we
used the models to perform a Word Similarity task on the WordSim-353 and SimLex-999 benchmark
datasets. Spearman’s correlation coefficient was calculated between the similarity scores given in
the datasets and the ones generated by these models.
Furthermore, the Roman Urdu models were also evaluated using a Sentiment Analysis task over a
Roman Urdu Tweets dataset. The embeddings from these models were fed to a Neural Network
which was fine-tuned for this task. The BERT model trained over Urdu was also evaluated using a
Sentence Classification task on the XNLI dataset. Our model reached the baseline score of 56.6
achieved by the Google Research Group’s Urdu model.
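
The Word Similarity evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the toy embeddings, and the toy benchmark scores are all invented for the example, and real runs would load trained vectors and the WordSim-353 or SimLex-999 word pairs instead. It computes a cosine similarity for each benchmark pair and then Spearman's correlation between the human scores and the model scores, skipping pairs that contain a word the model has no vector for.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank lists."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def evaluate_word_similarity(embeddings, benchmark):
    """Correlate model similarities with human scores, skipping unknown words."""
    human, model = [], []
    for word_a, word_b, score in benchmark:
        if word_a in embeddings and word_b in embeddings:
            human.append(score)
            model.append(cosine_similarity(embeddings[word_a], embeddings[word_b]))
    return spearman(human, model)


# Toy data, invented for illustration only (not from the project).
toy_embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "bus": [0.2, 0.8],
}
toy_benchmark = [
    ("cat", "dog", 9.0),
    ("car", "bus", 8.0),
    ("cat", "car", 1.0),
    ("dog", "bus", 2.0),
    ("cat", "plane", 3.0),  # skipped: "plane" has no embedding
]
rho = evaluate_word_similarity(toy_embeddings, toy_benchmark)
print(rho)
```

Because Spearman's coefficient compares rankings and not raw values, the model's cosine scores do not need to be on the same scale as the human ratings; only the ordering of the pairs matters.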

Technology Used: Python, TensorFlow, Gensim, Google Colab, Google Cloud
Supervisor Name: Ms. Mehreen Alam
Group Members:
Abubakar Ijaz (i16 - 0123)
Ali Nauman Qureshi (i16 - 0138)
Muhammad Raafey Tariq (i16 - 0259)

AI enrichment

Muhammad Raafey Tariq is a student who, together with two group members, conducted a comparative analysis of word embedding techniques (Word2Vec, GloVe, fastText, ELMo, and BERT) on Urdu and Roman Urdu corpora. The project involved training each model with the same set of parameters, evaluating the embeddings qualitatively and quantitatively, running a sentiment analysis task on a Roman Urdu Tweets dataset, and reaching the baseline score of 56.6 on the XNLI sentence classification task.

Skills (AI)

["Python", "TensorFlow", "Gensim", "Natural Language Processing", "Word Embeddings", "BERT", "Word2Vec", "GloVe", "fastText", "ELMo", "Sentiment Analysis", "Sentence Classification", "Google Cloud", "Google Colab"]

Status: ai_done

Provenance

Source file: Graduate Directory FAST School of Computing 2020 (Final Complete) (1).pdf
From job #23 page 188
Created: 1778170703