Wilker Aziz
University of Sheffield
Research Associate
CV
Google scholar

Department of Computer Science
Regent Court, 211 Portobello
Sheffield, S1 4DP
United Kingdom
Email: w.aziz@sheffield.ac.uk


NEWS

[10/09/14] I gave a talk on decoding for SMT at the MT Marathon. Check my slides ;)

[23/07/14] Our paper on exact decoding for PBSMT has been accepted at EMNLP-14! Come check our talk :)

[01/07/14] In January 2015 I will be joining SLPLL, led by Professor Khalil Sima'an, at ILLC.


BACKGROUND


I am a Research Associate working with Dr. Lucia Specia on discourse modelling in statistical machine translation.
I received a PhD from the University of Wolverhampton in 2014 on exact optimisation and sampling for SMT.
My advisors were Dr. Lucia Specia, Dr. Marc Dymetman and Professor Ruslan Mitkov.

Before that, I was an intern at the Xerox Research Centre Europe (France): in 2009-2010, working on source context modelling for SMT, and in the Fall of 2012, working on exact optimisation and sampling for phrase-based models.

I obtained my B.Sc. degree in Computer Engineering from the Engineering School of the University of Sao Paulo, Brazil, in 2010.
During my undergraduate studies I was a member of the Interinstitutional Center for Research and Development in Computational Linguistics (NILC), supervised by Ivandré Paraboni and Thiago Pardo.

INTERESTS


Statistical Machine Translation
Machine Learning


CURRENT RESEARCH


I am currently working on efficient inference with nonlocal parameterisation for SMT.


ACTIVITIES

Attending/Attended

Shared Tasks

Reviewing/recently reviewed for

PUBLICATIONS


2014

Aziz, W.; Dymetman, M.; Specia, L. (2014). Exact Decoding for Phrase-Based Statistical Machine Translation. EMNLP-2014. (pdf, bibtex)

Aziz, W. (2014). Exact Sampling and Optimisation in Statistical Machine Translation. (thesis, errata, bibtex)

Aziz, W.; Koponen, M.; Specia, L. (2014). Sub-sentence Level Analysis of Machine Translation Post-editing Effort. In Post-editing of Machine Translation: Processes and Applications, Chapter 8. (draft, bibtex)


2013

Aziz, W.; Dymetman, M.; Venkatapathy, S. (2013). Investigations in Exact Inference for Hierarchical Translation. In the Proceedings of the 8th Workshop on Statistical Machine Translation (WMT), pages 472-483, Sofia, Bulgaria. (pdf, bibtex)

Aziz, W.; Specia, L. (2013). Multilingual WSD-like Constraints for Paraphrase Extraction. In the Proceedings of the Seventeenth Conference on Computational Natural Language Learning (CoNLL), pages 202-211, Sofia, Bulgaria. (pdf, bibtex)

Aziz, W.; Mitkov, R.; Specia, L. (2013). Ranking Machine Translation Systems via Post-Editing. In Proceedings of Text, Speech and Dialogue (TSD 2013). Lecture Notes in Computer Science, pages 410-418, Pilsen, Czech Republic. Springer Verlag. The final publication is available at springer.com (draft, bibtex, data)


2012

Koponen, M.; Aziz, W.; Ramos, L.; Specia, L. (2012). Post-editing Time as a Measure of Cognitive Effort. In the AMTA 2012 Workshop on Post-Editing Technology and Practice (WPTP 2012), pages 11-20, San Diego, USA. (pdf, bibtex, data)

Rios, M.; Aziz, W.; Specia, L. (2012). UOW: Semantically Informed Text Similarity. In The First Joint Conference on Lexical and Computational Semantics -- Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012), pages 673--678, Montreal, Canada. (pdf, bibtex)

Aziz, W.; Sousa, S. C. M.; Specia, L. (2012). Cross-lingual Sentence Compression for Subtitles. In The 16th Annual Conference of the European Association for Machine Translation, EAMT ’12, pages 103--110, Trento, Italy. (pdf, bibtex, data)

Aziz, W.; Specia, L. (2012). PET: a Tool for Post-editing and Assessing Machine Translation. In The 16th Annual Conference of the European Association for Machine Translation, EAMT ’12, page 99, Trento, Italy. (pdf)

Aziz, W.; Sousa, S. C. M.; Specia, L. (2012). PET: a tool for post-editing and assessing machine translation. In The Eighth International Conference on Language Resources and Evaluation, LREC ’12, Istanbul, Turkey. (pdf, bibtex, PET, data, github)


2011

Specia, L.; Hajlaoui, N.; Hallett, C.; Aziz, W. (2011). Predicting Machine Translation Adequacy. In The Thirteenth Machine Translation Summit (MTSummit-2011), pages 513--520, Xiamen, China. (pdf, bibtex)

Aziz, W.; Rios, M.; Specia, L. (2011). Shallow Semantic Trees for SMT. In Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT-2011), pages 316--322, Edinburgh, Scotland. (pdf, bibtex)

Rios, M.; Aziz, W.; Specia, L. (2011). TINE: A Metric to Assess MT Adequacy. In Proceedings of the Sixth Workshop on Statistical Machine Translation (WMT-2011), pages 116--122, Edinburgh, Scotland. (pdf, bibtex)

Aziz, W.; Rios, M.; Specia, L. (2011). Improving Chunk-based Semantic Role Labeling with Lexical Features. In Proceedings of the International Conference Recent Advances in Natural Language Processing 2011, pages 226--232, Hissar, Bulgaria. (pdf, bibtex)

Sousa, S. C. M.; Aziz, W.; Specia, L. (2011). Assessing the post-editing effort for automatic and semi-automatic translations of DVD subtitles. In Proceedings of the International Conference Recent Advances in Natural Language Processing 2011, pages 79--103, Hissar, Bulgaria. (pdf, bibtex)

Aziz, W.; Specia, L. (2011). Fully Automatic Compilation of Portuguese-English and Portuguese-Spanish Parallel Corpora. In Proceedings of the 8th Brazilian Symposium in Information and Human Language Technology (STIL-2011), Cuiaba, Brazil. (pdf, bibtex, data, more)


2010

Aziz, W.; Dymetman, M.; Mirkin, S.; Specia, L.; Cancedda, N.; Dagan, I. (2010). Learning an Expert from Human Annotations in Statistical Machine Translation: the Case of Out-of-Vocabulary Words. 14th Annual Conference of the European Association for Machine Translation (EAMT-2010), pages 28--35, Saint-Raphael, France. (pdf, bibtex)

Aziz, W.; Specia, L. (2010). Combining Dictionaries and Contextual Information for Cross-Lingual Lexical Substitution. ACL SigLex Workshop Evaluation Exercises on Semantic Evaluation (SemEval-2010), pages 117--122, Uppsala, Sweden. (pdf, bibtex)


RESOURCES/CODE

Code

Post-Editing Tool

Portuguese-English and Portuguese-Spanish parallel corpora


LINKS

Resources and Tools for Brazilian Portuguese: NILC