My name's Szymon Rutkowski. I am interested in democratizing tools
for understanding the big flows of words that we produce.
Currently my main project is ActualScan. It's a search engine where users can jointly crawl
the websites that interest them and then apply various analyses to the collected index.
(Currently in alpha testing.)
If you're interested in working with me, here is a sample of the fields, tools and technologies
that I've worked with, and often still do.
word sense disambiguation
- ActualScan (English) - an analytic search engine with social indexing that I'm building
- Ciesiołka Znaków (Polish) - my old blog, mainly about applications of linguistics (word morphology) and machine learning in language processing
- The Old Republic (English, Polish) - explorations into controversies, ideologies and political debates in the Polish-Lithuanian Commonwealth, before 1795
- Estimating senses with sets of lexically related words for Polish word sense disambiguation (with P. Rychlik and A. Mykowiecka), GWC 10: ClarinPL
- Evaluation of basic modules for isolated spelling error correction in Polish texts, LTC 19: ArXiv
- History – 2020, Uniwersytet Warszawski:
Język laudów sejmikowych w latach 1572-1696 jako przedmiot badań komputerowych (Language of local assembly (sejmik) resolutions as a subject of computational research)
/I am also trained as a historian, most interested in civic republicanism and its roots in early modernity. My recent projects concern the computational
processing of resolutions of sejmiks, the "town meetings" of the nobility in the Polish-Lithuanian Commonwealth./
- Cognitive Science – 2018, Uniwersytet Warszawski:
Modele automatycznego poprawiania błędów w języku polskim (Models of automatic spelling correction for Polish)
(I also experimented with biological neural nets, as described here,
see also the repo)