Itv0060 2014 archive

Source: Lambda

Code: ITV0060
Link: http://www.lambda.ee/index.php/Teadmiste_otsing,_formaliseerimine_ja_hoidmine or http://www.lambda.ee/index/itv0060
Lecturer: Tanel Tammet
Contact: tanel.tammet@ttu.ee, 6203457, TTÜ AK223
Archives of previous years: 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, older.


NB! This is the 2014 archive, not the materials for the course currently in progress.

  • 14 May: we review the exam materials + lecture on indexes.
  • 16 May is the last official day for presenting practical work, but ...
  • 21 May is a reserve day for practical work + lecture on indexes.


The exam has four questions in the form of small exercises: one from each of the major topics below.

Exam times:

  • 26 May, Monday: in room ICT-A2 at 10:00
  • 3 June, Tuesday: in room ICT-A2 at 10:00


Time, place, result

Semester: spring
Grading: exam

Lectures: every Wednesday 16:00-17:30, room ICT-A2
Practical work: Fridays on odd weeks (7 Feb, 21 Feb, ...) 12:00-13:30, room ICT-403
Practical work will give 40% and exam 60% of points underlying the final grade.

Focus

The main focus of the course is on KR (knowledge representation): how to represent nontrivial information in programs and databases, how to build and use indexes for efficient search through large sets of knowledge.

The course contains four blocks built on each other:

  • Background and basics. Representing simple facts.
  • Representing rules.
  • Time, planning and uncertain knowledge.
  • Indexes and search.

About half of the course topics are covered in this book.

Practical work

There are three labs: the first two are obligatory, the third is optional. Each lab has to be presented to the lecturer and to all students present at the lab session.

First lab 2014

The goal of the first lab is to write a software system able to scrape factual raw data about a person with a name given to the program.

In short, search for web pages containing the name and extract relevant words from a list you create: words that appear closer to the name and occur more often are more important, i.e. a better match. It may be a good idea to run some searches that already combine the name with interesting keywords. Also, whenever you run a search or pull a page, store the search result or HTML source in a file, so that you do not exhaust your search quota or waste bandwidth and time.
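The two ideas in this paragraph, caching every fetched page and scoring listed words by proximity and frequency, can be sketched roughly as follows. This is only an illustration under assumptions: the names `cached_fetch`, `score_keywords` and `CACHE_DIR`, the `1/distance` weighting and the 10-token window are all hypothetical choices, not something prescribed by the lab. The actual download function is passed in as `fetch`, so any search API or HTTP client can be plugged in.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path

CACHE_DIR = Path("page_cache")  # hypothetical cache location


def cached_fetch(url, fetch, cache_dir=CACHE_DIR):
    """Return the page for `url`, calling `fetch(url)` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL to get a safe file name.
    path = cache_dir / (hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html")
    if path.exists():
        return path.read_text(encoding="utf-8")
    html = fetch(url)  # e.g. a wrapper around urllib.request.urlopen
    path.write_text(html, encoding="utf-8")
    return html


def score_keywords(text, name, keywords, window=10):
    """Score each listed keyword found near the name.

    Every occurrence within `window` tokens of an occurrence of the name
    adds 1/distance (an assumed weighting), so words that are closer and
    more frequent end up with higher scores.
    """
    tokens = re.findall(r"\w+", text.lower())
    name_tokens = name.lower().split()
    n = len(name_tokens)
    name_positions = [i for i in range(len(tokens) - n + 1)
                      if tokens[i:i + n] == name_tokens]
    wanted = {k.lower() for k in keywords}
    scores = defaultdict(float)
    for i, tok in enumerate(tokens):
        if tok not in wanted:
            continue
        for p in name_positions:
            # Distance in tokens to the nearest edge of the name.
            dist = p - i if i < p else i - (p + n - 1)
            if 0 < dist <= window:
                scores[tok] += 1.0 / dist
    return dict(scores)
```

For example, `score_keywords(page_text, "Tanel Tammet", ["professor", "logic"])` returns a dictionary from keyword to score, containing only the keywords that were found near the name.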

Deadline: recommended mid-March, absolutely latest end of March (after this there will be a penalty).

See also notes for KR lab 1.

Second lab 2014

The goal of the second lab is to write and use rules to categorize/tag people according to the data obtained during the first lab. A person should get a number of tags, each with a numerical indicator showing how much we trust that the tag really applies to the person, plus (sometimes) a number indicating the degree to which the tag applies.

As a concrete scenario, imagine we want to show ads to people and choose the ads based on each person's tags/categories.
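One possible shape for such tags and rules is sketched below. Everything here is an assumption made for illustration: the `Tag` class, the example `RULES`, the noisy-or way of combining trust from several rules, and the `pick_ad` helper are hypothetical, and the lab leaves the actual rule format open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tag:
    name: str
    trust: float                    # how sure we are that the tag applies, 0..1
    degree: Optional[float] = None  # how strongly it applies, when meaningful


# Hypothetical rules: (keyword from the lab 1 data, tag, trust contributed).
RULES = [
    ("marathon", "runner", 0.6),
    ("jogging", "runner", 0.3),
    ("professor", "academic", 0.8),
]


def tag_person(keyword_scores, rules=RULES):
    """Apply every rule whose keyword has a positive score for the person."""
    trust = {}
    degree = {}
    for keyword, tag, weight in rules:
        score = keyword_scores.get(keyword, 0.0)
        if score <= 0:
            continue
        # Assumed combination: noisy-or, so several weak rules add up
        # without the trust ever exceeding 1.
        trust[tag] = 1 - (1 - trust.get(tag, 0.0)) * (1 - weight)
        # Assumed degree: the strongest keyword score, capped at 1.
        degree[tag] = max(degree.get(tag, 0.0), min(score, 1.0))
    return [Tag(t, trust[t], degree[t]) for t in sorted(trust)]


def pick_ad(tags, ads, min_trust=0.5):
    """Return the ad for the most trusted tag above `min_trust`, or None."""
    candidates = [t for t in tags if t.trust >= min_trust and t.name in ads]
    if not candidates:
        return None
    return ads[max(candidates, key=lambda t: t.trust).name]
```

With keyword scores such as `{"marathon": 2.0, "jogging": 0.4}`, both "runner" rules fire and their trust values are combined into a single tag, which `pick_ad` can then map to an ad.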

See details for KR lab 2.

Third lab 2014

This lab is optional and gives as many points towards the final result and the grade as lab 1 or 2: practical work 60% and exam 40%.

The goal of the lab is to use WordNet or Teksaurus to create an additional ruleset and to use it alongside your own rules from lab 2.
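As a rough sketch of what "an additional ruleset" could mean, the code below derives extra keyword rules from synonyms of the keywords in existing rules. The `SYNONYMS` dictionary is a hand-written, hypothetical stand-in for real lookups in WordNet or Teksaurus, and both the rule format `(keyword, tag, trust)` and the `discount` factor are assumptions.

```python
# Hypothetical stand-in for synonym lookups in WordNet or Teksaurus.
SYNONYMS = {
    "professor": ["lecturer", "academic"],
    "marathon": ["race"],
}


def expand_rules(rules, synonyms, discount=0.5):
    """Derive extra (keyword, tag, trust) rules from synonyms.

    A derived rule gets a lower trust than the rule it comes from,
    because a synonym is weaker evidence than the original keyword.
    """
    seen = {(keyword, tag) for keyword, tag, _ in rules}
    extra = []
    for keyword, tag, weight in rules:
        for syn in synonyms.get(keyword, []):
            if (syn, tag) not in seen:
                seen.add((syn, tag))
                extra.append((syn, tag, weight * discount))
    return extra
```

The derived rules are returned separately, so they can be kept apart from the hand-written rules and switched on or off when comparing results.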

Student ideas are also welcome, but they have to be agreed with Tanel first.

Lecture block 1: basics and representing simple facts.

Overview of the course. Background and basics. First lab.

Lecture materials:


Programming and databases. SQL: meaning and representation of facts


Core ideas of non-relational databases, mostly RDF

Lecture materials:

HTML annotations. Microformats, microdata, RDFa

Lecture materials:


Understand main parts (not part of exam):

Data extraction from the web

not part of exam

  • Overview
  • Some details about NELL as a nice example.

Lecture block 2: representing rules

RDFS and logic

Understand RDFS:

Lecture material:

Additional details (not part of exam):

Important KR languages

Not necessary for the exam.

Several languages:

  • RDFS
  • OWL
  • KIF
  • CL
  • ontologies
  • WordNet
  • Cyc
  • restricted English systems
  • frame systems

See also:

RDFS and OWL

Understand basics of OWL:

not part of exam:

OWL background: description logics (not part of exam)

Start looking at interesting ontologies:

Restricted English

Attempto details not necessary for the exam.

Lecture block 3: time, planning and uncertain knowledge

Rules in planning and robotics

Lecture material:

See also (not part of exam):

Logic for uncertain knowledge

For the exam: you should be able to create and solve small examples with default logic.
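A small example of the kind meant here can be worked through in code. The sketch below is a simplification under stated assumptions: it handles only ground literals written as strings (with `-` for negation), strict rules that are forward-chained, and normal defaults of the form "prerequisite : conclusion / conclusion". The function names and the Tweety data are illustrative choices, and the greedy loop finds one extension that depends on the order of the defaults.

```python
def neg(lit):
    """Negate a literal written as 'p' or '-p'."""
    return lit[1:] if lit.startswith("-") else "-" + lit


def close(facts, rules):
    """Forward-chain strict rules (premises, conclusion) to a fixpoint."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts


def consistent(facts):
    """True if no literal occurs together with its negation."""
    return not any(neg(f) in facts for f in facts)


def extension(facts, rules, defaults):
    """Compute one extension for normal defaults (prerequisite, conclusion)."""
    current = close(facts, rules)
    applied = True
    while applied:
        applied = False
        for prereq, concl in defaults:
            if prereq in current and concl not in current:
                # A default fires only if its conclusion stays consistent.
                candidate = close(current | {concl}, rules)
                if consistent(candidate):
                    current = candidate
                    applied = True
    return current


# Penguins are birds and do not fly; birds fly by default.
STRICT = [(["penguin(tweety)"], "bird(tweety)"),
          (["penguin(tweety)"], "-flies(tweety)")]
DEFAULTS = [("bird(tweety)", "flies(tweety)"),
            ("bird(polly)", "flies(polly)")]
FACTS = {"penguin(tweety)", "bird(polly)"}
```

Here `extension(FACTS, STRICT, DEFAULTS)` contains `flies(polly)` but not `flies(tweety)`, because the default for Tweety is blocked by the strict rule. In the Nixon diamond, with defaults `("quaker", "pacifist")` and `("republican", "-pacifist")`, the result depends on which default is tried first, which reflects the two extensions of that theory.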

Fuzzy and probabilistic logic

For the exam: understand the differences between fuzzy and probabilistic logic and be able to present small examples.
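One way to see the difference on a small example: fuzzy logic assigns degrees of truth and combines them truth-functionally, whereas a probability measures uncertainty about a statement that is itself either true or false, and combining probabilities requires knowing how the events depend on each other. The sketch below assumes the common Zadeh operators (min, max, 1 - x) for the fuzzy side and independent events for the probabilistic side; the function names are illustrative.

```python
def fuzzy_and(a, b):
    """Zadeh conjunction: the degree of 'A and B' is the smaller degree."""
    return min(a, b)


def fuzzy_or(a, b):
    """Zadeh disjunction: the degree of 'A or B' is the larger degree."""
    return max(a, b)


def fuzzy_not(a):
    return 1.0 - a


def prob_and_independent(pa, pb):
    """P(A and B), valid only when A and B are independent."""
    return pa * pb


def prob_or_independent(pa, pb):
    """P(A or B) = P(A) + P(B) - P(A and B), for independent A and B."""
    return pa + pb - pa * pb


# "John is tall" to degree 0.5: "tall and not tall" still holds to degree 0.5.
tall = 0.5
fuzzy_contradiction = fuzzy_and(tall, fuzzy_not(tall))

# A coin lands heads with probability 0.5: "heads and not heads" has
# probability 0, because the two events are mutually exclusive rather than
# independent, so the product formula does not apply to them.
prob_contradiction = 0.0
```

With the same input 0.5, the fuzzy conjunction of two statements gives 0.5 while the probability of two independent events both happening is 0.25, so the two formalisms already diverge on the simplest case.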

Logic of belief and knowledge

For the exam: understand referential transparency and core ideas about encoding belief and knowledge. Modal logic is not necessary for the exam.

Lecture block 4: indexes and search

Indexes: intro and mainstream

Traditional database indexes incl. B+ tree:

Hash indexes:

Bitmap indexes:

For the exam: understand the core usage scenarios and be able to create small examples.
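Small examples of the three usage scenarios might look like the sketch below, built over a made-up four-row table. These are assumptions for illustration only: the hash index is a plain dictionary, the bitmap index stores one integer bitmask per distinct value, and the ordered index is a sorted list searched with `bisect`, which stands in for a B+ tree (it supports the same range queries but is not a B+ tree implementation).

```python
import bisect
from collections import defaultdict

# Made-up example table; the list position serves as the row id.
ROWS = [
    {"city": "Tallinn", "status": "student", "age": 22},
    {"city": "Tartu", "status": "staff", "age": 41},
    {"city": "Tallinn", "status": "staff", "age": 35},
    {"city": "Tallinn", "status": "student", "age": 19},
]


def build_hash_index(rows, field):
    """Hash index: value -> row positions. Good for equality lookups."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[field]].append(pos)
    return index


def build_bitmap_index(rows, field):
    """Bitmap index: value -> bitmask with bit `pos` set for matching rows.

    Suits fields with few distinct values; conditions combine with & and |.
    """
    bitmaps = defaultdict(int)
    for pos, row in enumerate(rows):
        bitmaps[row[field]] |= 1 << pos
    return bitmaps


def bitmap_positions(bitmap):
    """List the row positions whose bit is set."""
    return [pos for pos in range(bitmap.bit_length()) if (bitmap >> pos) & 1]


def build_sorted_index(rows, field):
    """Ordered index as parallel sorted lists (a stand-in for a B+ tree)."""
    pairs = sorted((row[field], pos) for pos, row in enumerate(rows))
    return [k for k, _ in pairs], [p for _, p in pairs]


def range_query(index, low, high):
    """Row positions with low <= key <= high, in key order."""
    keys, positions = index
    return positions[bisect.bisect_left(keys, low):bisect.bisect_right(keys, high)]
```

For instance, students living in Tallinn can be found by AND-ing two bitmaps, `build_bitmap_index(ROWS, "city")["Tallinn"] & build_bitmap_index(ROWS, "status")["student"]`, while an age range such as 20 to 40 is a natural fit for the ordered index.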

Multi-field and geoindexes, fulltext and term indexes

not part of exam:

Fancier term indexes

  • Path indexes
  • On from here.

NoSQL indexes

  • Document bases
  • Graph bases


