Academic year 2014-15

Information Technologies

Degree: Bachelor's Degree in Computer Science; Code: 21419; Type: Compulsory subject, 2nd year
Degree: Bachelor's Degree in Telematics Engineering; Code: 21771; Type: Optional subject
Degree: Bachelor's Degree in Audiovisual Systems Engineering; Code: 21658; Type: Optional subject

 

ECTS credits: 4 Workload: 100 hours Trimester: 3rd

 

Department: Dept. of Information and Communication Technologies
Coordinator: Ricardo Baeza Yates
Teaching staff:
Language:
Timetable:
Building: Communication campus - Poblenou

 

Introduction

This course introduces the basic concepts of information retrieval and their application to Web search engine technology. Emphasis is placed on retrieval models and the most important evaluation techniques, including techniques for building hierarchies of Web pages. We will introduce the most widely used text index, the inverted index, and explain how to build it and how to use it. We will also study how to solve the scalability issues that arise when the volume of data and queries increases. Finally, these concepts will be applied to other types of data, such as structured XML text and multimedia, and to specific domains, such as legal and health corpora.
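As a first intuition for the inverted index mentioned above, the following is a minimal sketch in Python. It assumes whitespace tokenization, lowercasing and a purely conjunctive (AND) query model; the function names `build_inverted_index` and `search_and` and the sample documents are illustrative only and are not taken from the course material.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Assumption: terms are lowercased, whitespace-separated tokens.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search_and(index, query):
    """Return the ids of the documents that contain every term of the query."""
    postings = [set(index.get(term, [])) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


docs = [
    "web search engines index web pages",
    "an inverted index maps terms to documents",
    "search engines rank pages",
]
index = build_inverted_index(docs)
print(index["index"])                       # [0, 1]
print(search_and(index, "search engines"))  # [0, 2]
```

A real search engine would add text normalization, ranking and compressed posting lists; the sketch only shows the term-to-documents mapping and how a query is answered by intersecting posting lists.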

 

Prerequisites

Knowledge of programming and basic data structures is required. Familiarity with string processing operations, along with notions of XML, algorithms and machine learning, is also recommended. Advanced experience in the use of Web search engines is assumed.

 

Associated competences

Skills to work on in the course as indicated in the curriculum of the degree.

Transversal competences and specific competences:

Instrumental

G1. Capacity for analysis and synthesis

G2. Ability to organize and plan

G3. Ability to apply knowledge to analyze situations and solve problems

G4. Ability in searching and information management

G5. Ability in decision making

G6. Ability to communicate properly both orally and in writing, in Catalan and Spanish, to both expert and non-expert audiences

Interpersonal

G8. Ability to work in a team

Systemic

G14. Motivation for quality and achievement

 

Specific Professional Skills

H4. Independent learning of new skills and techniques suitable for the conception, development and operation of computer systems.

Specific Skills Basic Training

B14. Know the theoretical fundamentals of programming and use practical methods and programming languages for the development of software systems.

B16. Know the basics of the architecture of computers and servers, as well as the principles of operating systems.

Specific Skills Computer Engineering

IN15. Understand the fundamentals of unstructured databases (document and multimedia) and the related techniques of classification and information retrieval.

IN36. Know and understand the principles of different methods and architectures of multimedia information, and be able to apply the most appropriate for each problem.

Specific Skills Common to the branch of Telecommunications

T7. Ability to understand and use software architecture concepts and methodologies for the design, verification and validation of software.

T8. Ability to carry out real-time, concurrent, distributed and event-driven programming, as well as the design of human-computer interfaces.

Specific Technology Skills: Audiovisual Systems

AU37. Know the basic techniques of data mining and Web mining, and be able to apply them to concrete problems.

AU39. Mastery of advanced techniques of intelligent information search on the web.

 

Assessment

To pass the course it is necessary to pass the exam at the end of the course and pass five lab exercises. These exercises will be reviewed and graded by the lab teachers during the lab sessions. Lab sessions are held in groups of 3 students.

At the end of the term there will be an exam covering all the material of the course, in which a minimum grade of 5 is required. The final grade for the course is the weighted sum of three parts:

- Seminars: 10%, not recoverable.

- Lab: 30%, not recoverable.

- Final exam: 60%, recoverable in July.

To pass the course, you must pass each of the three parts separately.
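The grading rule above can be sketched as a short calculation. This is only an illustration: it assumes grades on a 0 to 10 scale and a pass mark of 5 for every part (the text states the minimum of 5 only for the exam), and the names `WEIGHTS`, `PASS_MARK` and `final_grade` are hypothetical.

```python
# Weights of the three parts, as listed in the assessment section.
WEIGHTS = {"seminars": 0.10, "lab": 0.30, "exam": 0.60}

# Assumption: the same pass mark applies to each part, on a 0-10 scale.
PASS_MARK = 5.0


def final_grade(grades):
    """Return (weighted final grade, passed) for a dict of part grades."""
    total = sum(WEIGHTS[part] * grades[part] for part in WEIGHTS)
    # Every part must be passed separately, regardless of the weighted sum.
    passed = all(grades[part] >= PASS_MARK for part in WEIGHTS)
    return round(total, 2), passed


print(final_grade({"seminars": 7, "lab": 8, "exam": 6}))  # (6.7, True)
print(final_grade({"seminars": 9, "lab": 9, "exam": 4}))  # (6.0, False)
```

The second call shows why the weighted sum alone is not enough: the total reaches 6.0, but the exam grade is below the minimum, so the course is not passed.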

 

Contents

Content blocks

1. Information retrieval concepts

2. Text retrieval indexes

3. Web search engines

4. Applications of retrieval technology

 

Content block 1.- Information retrieval concepts

Concepts

1. Models

2. Text and query processing

3. Evaluation

Procedures

1. Programming

Attitudes

1. Clarity and neatness in the execution of the lab exercises

Content block 2.- Indexes

Concepts

1. Building

2. Search

Procedures

1. Programming

Attitudes

1. Clarity and neatness in the execution of the lab exercises

Content block 3.- Web search engines

Concepts

1. Architecture of Web search engines

2. Hierarchy of Web pages

3. Scalability

Procedures

1. Programming

2. Advanced use of Web search engines

Attitudes

1. Clarity and neatness in the execution of the lab exercises

Content block 4.- Applications of retrieval technology

Concepts

1. Multimedia

2. Structured text in XML

3. Specific domains

Procedures

1. Problem solving in class

Attitudes

1. Active participation in the seminars

 

Methodology

The theory sessions, all held in the large group, introduce the basic theoretical concepts and show the appropriate procedures for solving problems. The seminar sessions discuss applications of the theoretical concepts introduced. Laboratory sessions are dedicated to programming the different elements of a search engine. The goal of the lab exercises is twofold: on the one hand, they help students understand and consolidate the theoretical concepts; on the other hand, they serve as indicators for evaluating the achievement of competences related to search technologies. Work outside the classroom consists mainly of finding additional information, solving the proposed problems, preparing the lab exercises and studying beforehand.

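Hours per topic (activities in the classroom: Large group, Lab, Seminar; activities outside the classroom: Preparation of labs, Personal study and problem solving; evaluation: Exam)

Topics | Large group | Lab | Seminar | Preparation of labs | Personal study and problem solving | Exam
Introduction | 2 | - | - | - | - | -
1. Information retrieval concepts | 6 | 2 | - | 4 | 12 | -
2. Indexes | 4 | 4 | - | 8 | 8 | -
3. Web search engines | 6 | 4 | 2 | 8 | 12 | -
4. Applications | - | - | 6 | - | 9 | -
Evaluation | - | - | - | - | - | 3
Total: | 18 | 10 | 8 | 20 | 41 | 3

Total: 100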

 

Resources

Sources of information for learning. Textbooks (paper and partially digital)

· BAEZA-YATES, Ricardo; RIBEIRO-NETO, Berthier: Modern Information Retrieval, Second Edition. Addison-Wesley, 2011. Website: www.mir2ed.org.

 

Sources of information for learning. Further reading (digital)

· MANNING, Christopher D.; RAGHAVAN, Prabhakar; SCHÜTZE, Hinrich: Introduction to Information Retrieval. Cambridge University Press, 2008. Website: www.informationretrieval.org.