Well, I guess it’s time for another English post (finally), with the intention of reaching a broader audience and the hope of getting feedback on my ideas. In this post I want to give a first insight into my PhD topic, since it’s getting really „serious“ now (again, finally!), and I would appreciate any comments at this early stage of my work.
As you might know, I work on the project „eduComponents“ at the department of computer science of the OvGU. The eduComponents are a collection of components for the open source CMS Plone that extend Plone so that it can be used as a learning management system. This approach has many advantages: for example, everything is document-based, and you don’t have to take care of things like user and rights management yourself. I don’t want to elaborate on these points here, but if you have any questions or comments, feel free to contact me or browse our project’s publications.
The eduComponents include components for class management (ECLecture), multiple-choice tests (ECQuiz), assignment submission (ECAssignmentBox) with automatic testing of program code (ECAutoAssessmentBox and ECSpooler), and peer review of programming assignments (ECReviewBox). These components have been developed since 2005 and are used for courses in our workgroup and at other institutions around the world (the components are released under the GPL). Credit for development and implementation mostly goes to Mario and Michael (with the help of numerous student assistants). My part here (and also my entry into the Plone world) was porting the components to Plone 3 (again with enormous help from Mario and students) and maintaining them since the winter term 2008/2009.
Well, this has nothing to do with my PhD work in the first place, but I thought this would be a good „place“ to say a public thank you.
OK, so the situation is as follows: Students who attend our classes usually have to hand in a number of (programming) assignments in order to be admitted to the exam at the end of term. They are given tasks like „Write a Haskell function that computes the Fibonacci number for a given input.“ (yes, that is not that difficult, but for the sake of this post this example will do). Maybe a student has not been to the lecture where the professor talked about how to compute the Fibonacci number, or he has no clue how to write a function in Haskell. So what does he do? Right, he searches for „Haskell“ and „Fibonacci“ on Google. Google will return approx. 7.5 million hits for „Haskell“ and approx. 2.3 million hits for „Fibonacci“. Of course he could refine his query to „haskell tutorial“ (which still yields 285,000 hits), but all this takes time and doesn’t take into account that the student may already know how to program in SML, which is a functional programming language, just like Haskell. How could Google know?
This is where my idea comes into play. I want to develop and implement a recommender engine that offers a number of URLs to pages that might help the student solve the current assignment. This URL list is tailor-made for an individual user, since the recommendation engine takes into account what the learner already knows and which assignments he has already solved. The user in turn can rate the links („This was helpful / not helpful“), which triggers a re-computation of the link list (and the score is stored for future reference). Let me show you an image that helps to illustrate my idea:
The core is „eduSuggest“, the recommendation engine. As its first input it takes a query Qo („I search for HASKELL and FIBONACCI“). This query is sent by the frontend; in my case there is a Plone content type that generates this query from the assignment. The second input is a FOAF file describing the user profile. Why FOAF? Well, it looks like a promising fit for what I need. It’s an RDF vocabulary which I can mix with other RDF vocabularies, and it is typically serialized as RDF/XML, which is good for further processing. I guess there will be other posts here about using FOAF to model learners in the future; I still have to spend some thought on that.
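To make the FOAF input a bit more tangible, here is a minimal sketch in Python of how a learner profile could be read. Everything specific in it is an assumption for illustration: the profile itself, the example.org URIs, the function name `known_topics`, and the choice of `foaf:topic_interest` to express what a learner already knows. How learners are really modelled is still open.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical learner profile; the example.org URIs are placeholders.
PROFILE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.org/learners/alice#me">
    <foaf:name>Alice</foaf:name>
    <foaf:topic_interest rdf:resource="http://example.org/topics/SML"/>
    <foaf:topic_interest rdf:resource="http://example.org/topics/FunctionalProgramming"/>
  </foaf:Person>
</rdf:RDF>"""


def known_topics(foaf_xml):
    """Return the last path segment of every foaf:topic_interest URI."""
    root = ET.fromstring(foaf_xml)
    topics = []
    for person in root.iter(f"{{{FOAF}}}Person"):
        for interest in person.findall(f"{{{FOAF}}}topic_interest"):
            uri = interest.get(f"{{{RDF}}}resource")
            if uri:
                topics.append(uri.rsplit("/", 1)[-1])
    return topics


print(known_topics(PROFILE))
```

Plain XML parsing only works for this one RDF/XML layout. Since the same RDF graph can be serialized in several ways, a real implementation would more likely use a proper RDF library.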
OK, now we have the query and the learner’s profile. eduSuggest does some preprocessing on the query (e.g. a lookup in an ontology to find related terms, such as „SML“, which is also a functional programming language like „Haskell“) and sends the resulting query Qp to associated sources like del.icio.us, digg and other social bookmarking services as well as Wikipedia or even Google. Which resources eduSuggest queries is freely configurable, but the intention is to use resources which contain a collection of collectively tagged and rated links. The results from these resources are then processed by eduSuggest again. This time the processing includes „shaping“ the query results according to the learner’s profile, a lookup in eduSuggest’s database to retrieve the URLs’ scores (if any user has rated that link before), and an additional query for URLs that users themselves might have stored in the service. Then a ranking algorithm is applied. I am thinking of utilizing Bayesian networks here („Which URL is most likely to satisfy the user?“). An individual list of links, shaped to the learner’s profile and the current assignment, is then returned. The learner can then rate the links („Hey, that link you gave me was not helpful!“), which again triggers a re-computation of the link list (i.e. the probability scores in the Bayesian network are changed and the net itself is recomputed, returning an altered link list).
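Two of these steps can be sketched in a few lines of Python: expanding Qo into Qp, and re-ranking links after a rating. This is only a rough illustration under loud assumptions. The `RELATED` dictionary is a made-up stand-in for the ontology, all names (`expand_query`, `LinkScore`, `rate`, `rank`) are hypothetical, and the scoring is a simple Beta-Bernoulli estimate per link, not the Bayesian network I have in mind.

```python
from dataclasses import dataclass

# Toy stand-in for the ontology lookup; the entries are made up.
RELATED = {"haskell": ["functional programming", "sml"]}


def expand_query(terms, known):
    """Turn Qo into Qp by adding related terms the learner already knows."""
    expanded = [t.lower() for t in terms]
    known_lc = {k.lower() for k in known}
    for term in list(expanded):
        for rel in RELATED.get(term, []):
            if rel in known_lc and rel not in expanded:
                expanded.append(rel)
    return expanded


@dataclass
class LinkScore:
    url: str
    helpful: int = 0
    not_helpful: int = 0

    def probability(self):
        # Posterior mean under a uniform Beta(1, 1) prior:
        # an unrated link starts at 0.5 and moves with each rating.
        return (self.helpful + 1) / (self.helpful + self.not_helpful + 2)


def rate(link, was_helpful):
    """Store a „helpful / not helpful“ rating for a link."""
    if was_helpful:
        link.helpful += 1
    else:
        link.not_helpful += 1


def rank(links):
    """Order links by estimated probability of being helpful."""
    return sorted(links, key=lambda link: link.probability(), reverse=True)


qp = expand_query(["Haskell", "Fibonacci"], known=["SML"])
a = LinkScore("http://example.org/a")  # placeholder URLs
b = LinkScore("http://example.org/b")
rate(a, False)
rate(b, True)
print(qp, [link.url for link in rank([a, b])])
```

Each rating changes a link’s estimated probability right away, so calling `rank` again returns the altered list. A real Bayesian network would additionally tie these probabilities to the learner’s profile and the assignment instead of treating every link independently.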
Well that’s roughly the main idea. Phew, I think that was the longest blog post I have ever written.
Comments, questions, anything? Thinking I am crazy, stupid, smart, <insert appropriate term here>? Feel free to drop a comment!
image credits: Piled Higher & Deeper