04.30.08

Patient Matching, The First Step

Posted in OpenMRS, Summer of Code at 6:50 am by nribeka

My first phone discussion about my project with my mentors, Shaun Grannis and James Egg, went well. Shaun and James explained the project to me in detail, and I think it is really interesting. I asked a couple of silly questions that were not related to the project, though. Sorry about that, Shaun and James, hehe …

My first project is to implement a fully functional random sample analyzer that calculates the rate of random agreement among corresponding pairs of records between two data sources. This rate will replace the u rate (the field agreement rate among pairs that are truly non-matches) that currently comes from the Expectation Maximization analyzer. To get a better overview of the linkage process and the rationale behind it, you should read this publication about record linkage. If you want to know more about the Expectation Maximization algorithm, you can read the wiki or some other journals and publications.

The process for generating the u value for each column is as follows:

  • Generate two arrays of Record objects, one per data source, each no larger than the desired maximum sample size
  • Take one Record from each array at a time and do the following:
    • For each demographic field in the Record, compare the two values using the selected String matching algorithm (Jaro-Winkler, Levenshtein, Longest Common Substring or Exact Match)
    • If the values from both Records match, increment the match count for the current demographic field.
  • Repeat the process above until all records have been paired and examined
  • Calculate the u value for each demographic field (match count divided by the number of pairs examined) and set the new u value on the MatchConfig object.
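
The steps above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the actual OpenMRS patient matching code: records are plain Map<String, String> objects instead of Record, the names RandomSampleSketch, estimateU and sample are hypothetical, the string comparator is passed in as a BiPredicate (exact match in the usage note), and the result is returned as a map rather than being set on a MatchConfig.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.BiPredicate;

// Hypothetical sketch: none of these names come from the OpenMRS code base.
class RandomSampleSketch {

    // Estimates the u value (rate of random agreement) for each field.
    static Map<String, Double> estimateU(List<Map<String, String>> sourceA,
                                         List<Map<String, String>> sourceB,
                                         List<String> fields,
                                         int maxSampleSize,
                                         BiPredicate<String, String> comparator,
                                         long seed) {
        Random random = new Random(seed);

        // Step 1: draw a random sample from each data source.
        List<Map<String, String>> sampleA = sample(sourceA, maxSampleSize, random);
        List<Map<String, String>> sampleB = sample(sourceB, maxSampleSize, random);
        int pairs = Math.min(sampleA.size(), sampleB.size());

        Map<String, Integer> agreements = new LinkedHashMap<>();
        for (String field : fields) {
            agreements.put(field, 0);
        }

        // Steps 2-3: pair the i-th record of each sample and compare field by field.
        for (int i = 0; i < pairs; i++) {
            Map<String, String> a = sampleA.get(i);
            Map<String, String> b = sampleB.get(i);
            for (String field : fields) {
                String valueA = a.get(field);
                String valueB = b.get(field);
                // A missing value on either side is counted as a disagreement.
                if (valueA != null && valueB != null && comparator.test(valueA, valueB)) {
                    agreements.merge(field, 1, Integer::sum);
                }
            }
        }

        // Step 4: u = agreeing pairs / pairs examined, per field.
        Map<String, Double> u = new LinkedHashMap<>();
        for (String field : fields) {
            u.put(field, pairs == 0 ? 0.0 : (double) agreements.get(field) / pairs);
        }
        return u;
    }

    // Returns up to 'size' randomly chosen elements without modifying the source list.
    static <T> List<T> sample(List<T> source, int size, Random random) {
        List<T> copy = new ArrayList<>(source);
        Collections.shuffle(copy, random);
        return new ArrayList<>(copy.subList(0, Math.min(size, copy.size())));
    }
}
```

Calling estimateU(sourceA, sourceB, fields, 1000, String::equals, 42L) would use exact matching. A similarity-based comparator such as Jaro-Winkler could be plugged in as a BiPredicate that checks the score against a threshold. Because the pairs are drawn at random, almost all of them are non-matches, which is why the agreement rate can stand in for the u value.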

I still need to dig deeper into the first step and see how each data source is read and converted into Record objects. What do you think of the process above? Did I miss anything?

04.25.08

Summer of Code 2008

Posted in OpenMRS, Summer of Code at 2:28 am by nribeka

Woohoo yaaayyyyy … After a week's delay, Google finally announced the list of accepted students for this year’s Google Summer of Code, and guess what, I’m on the list. Wooohhhhoooo … I was so excited when I saw my name on the list, which means I’m accepted as a student intern at OpenMRS for the 2008 Google Summer of Code (GSoC 2008), wooohhoooo yaaayyyy …

Open Medical Record System (OpenMRS), formed in 2004, is an open source medical record system framework for developing countries. For this year’s GSoC, OpenMRS has many projects to offer students. I’ll be working on the Extend Patient Matching Analyzers and Heuristics Support project. This blog is dedicated to posting any progress that I make on this project.

Right now, I’m still trying to understand the current implementation of patient matching in OpenMRS. My mentor, Shaun Grannis, gave me some journal articles and publications to read in order to understand the matching algorithm. The main idea of the patient matching project is to enable aggregation of a patient’s data that is scattered across different places under different patient identifiers. I think the work will be very interesting and challenging. Burke Mamlin, another mentor in OpenMRS, said that students will have fun working for OpenMRS, and I think he’s right. The community is really fun and full of friendly people.