🎄12 Days of HPC 2021


Blog post number 5 in our 12 Days of HPC series, from the School of Languages, Cultures and Societies, Centre for Translation Studies!

During the month of December we’re featuring blog posts from researchers from across the University of Leeds showcasing the fantastic work they do using our High Performance Computing system. Follow us @RC_at_Leeds to keep up to date with our 12 days of HPC blog series.

What’s your name?

Nouran Khallaf

What department do you work in?

School of Languages, Cultures and Societies, Centre for Translation Studies

What research question are you trying to answer?

My PhD research project aims to build an Arabic lexical simplification system that can be used by a wide range of users. The research questions to be investigated are: how text complexity/readability can be measured; which linguistic phenomena make Arabic text complex; what the principles of Arabic lexical simplification are; and why some texts are difficult to simplify.

What tools or technologies do you use in your research? (Programming languages, packages, APIs)

Python, PyTorch, Transformers, Natural Language Toolkit (NLTK), scikit-learn, and Arabic linguistic analysis tools (MADAMIRA, CamelParser)

How does HPC help your research?

HPC made it possible to analyse the large corpora used and to experiment with different NLP techniques, so that a robust Arabic sentence readability classifier could be built more quickly. It also made it feasible to train on a large feature set in which weighted vectors represent sentence embeddings.

What is the potential impact of your research?

This research will provide an Arabic sentence readability classifier and the first Arabic lexical simplification model built using the latest NLP techniques. The sentence readability classification system would also be another resource that researchers or tutors of Arabic as a second language could use to select appropriate texts for their purposes. These applications will pave the way for the extension and consistent improvement of current Arabic NLP research.

In your personal opinion what’s the coolest thing about your research?

I think the coolest thing is trying to answer questions in text simplification, which is a very active area of NLP right now: finding ways of making texts easier to understand, while applying new techniques to Arabic.

In your opinion, what is the ultimate Christmas song?

All I Want for Christmas Is You

Figure: System graphic for sentence readability classification