Posts Tagged ‘education’

Originality in research

22 July 2011

We never think entirely alone: we think in company, in a vast collaboration; we work with the workers of the past and of the present. [In] the whole intellectual world … each one finds in those about him [or her] the initiation, help, verification, information, encouragement, that he [or she] needs.

— A. D. Sertillanges [11, p.145]

In a post called It’s academic, I used the phrase “original contribution to scholarship” to describe one of many important outcomes of PhD research. The phrase is usually synonymous with publications in peer-reviewed professional literature. Yes, having our research published in a research journal is an important outcome. It is one of many tangible outcomes of our candidature, besides a hard-bound thesis thick enough to be used as a paperweight and so chock-full of jargon that it frightens away everyone except the brave souls of our professional society. But what about intangible outcomes? In this post, I start exploring some common intangible outcomes of a PhD education, drawing on the phrase “original contribution to scholarship” as a springboard for discussion. Think of this post as the first installment in the series of SOC posts: scholarship, originality, contribution, but not necessarily to be discussed in that order.

Prior art

First, I deal with the issue of originality in this installment. Originality can be described as something that has not been conceived of or done before by someone. With a vast body of research literature out there, how are we to know what we’re doing is original research? How do we know we’re not duplicating someone else’s published work? Short answer: it’s almost impossible to know for certain. There’s an apocryphal anecdote that John von Neumann came up with two solutions to two problems, but on both occasions afterward learnt that someone had already solved the problems. Another example is Gauss’ work. Gauss didn’t publish much of his mathematics research. In many cases, later works were found to have been rediscoveries of Gauss’ unpublished work. Or consider the calculus war [1], a priority dispute over whether Leibniz or Newton was the original inventor of calculus. One of our best guards against duplicating someone else’s work is to be aware of the general research trends in our chosen area and have a clear statement of our research questions. These two topics will be discussed in turn.

To know what’s happening in our area, we should start off by reading publications that we think are directly relevant to our project. A good starting point is expository papers, Wikipedia articles, or other publications providing high-level introductions to certain topics. The objective is to get a layperson’s overview of our research area before delving into research papers. Survey papers generally provide technical overviews of particular research areas, with detailed analyses and discussions of salient theories, trends, topics, and techniques. A survey paper’s purpose is to be a guide to the vast body of research publications on a particular topic, often citing hundreds of relevant publications spread over dozens of journals and books. More often than not, a survey paper should be read as if it were a practitioner’s guide to recent research. Don’t expect every concept or technique to be clearly explained, or explained at all. A good strategy for reading survey papers is to have ready access to relevant textbooks and reference materials. In fact, we should find out whether there are any textbooks or reference books relevant to our research topic and read them as well. In short, reading survey papers, expository publications, and textbooks should equip us with a bird’s-eye view of previous work in our area.

Once we have a general (or vague) idea of what has been published in our research area, we should start writing a literature review. This is similar to writing a survey of previously published work in our area, with detailed discussions of relevant theories, trends, topics, and techniques. It is by means of writing a literature review that we acquaint ourselves with relevant technical details, and with the successes and limitations of published techniques. Think of the literature review as a survey paper based on a sample of publications most relevant to our research questions. As we write our literature review, we should also maintain a list of research questions together with high-level strategies to investigate them. There’s never enough time to write a thorough and comprehensive literature review. The best we can do is to add to our literature review incrementally as we progress through our research project. When the time comes to write our research proposal, we will have available our literature review and a list of research ideas from which to draw inspiration and material.

Originality in conduct

Thus far, I have discussed how to start doing original research by first being aware of prior research. I now turn to originality in terms of how we conduct or carry out our project. Cryer [2, pp.193–196] identified a number of broad approaches in which a project is seen to be original. The author also identified originality in terms of a project’s outcomes, but I won’t discuss that point here. A project can be original in terms of: (1) tools, techniques, procedures, and methods; (2) exploring an unknown; (3) exploring something unanticipated; or (4) using data in a novel way. These will be discussed in turn.

First, our project could be original in terms of developing new tools, techniques, procedures or methods, or applying these to an existing problem or to a context where they have not been tried before. Regardless of the success or failure of such investigations, we would more often than not discover cases where a new approach yields new or insightful results, and situations where the approach sheds no light at all. Often, it is not enough to develop a computer program implementing a new algorithm or technique; we need to show how that program could be used to shed new light on a problem.

Second, we could venture into an unexplored area and in the process discover a new field of research. Examples of such investigations into the unknown include Hardin’s “tragedy of the commons” [9], Diffie and Hellman’s development of public-key cryptography [3], Haar’s introduction of wavelets [8], and Euler’s introduction of graph theory [5]. It’s rare to develop a new field from scratch, and even if we do so it might take years or decades or even longer before anyone realizes the new field’s full potential. We should also be aware that scientific research is a competitive enterprise, with the attendant occasional priority disputes over who did what before whom.

Third, while investigating a problem we might serendipitously come across an unexpected result, phenomenon, or research direction. Examples of fortuitous discoveries include Louis Pasteur’s discovery of the chicken cholera vaccine, Oskar Minkowski’s discovery that diabetes results from a pancreas disorder, and Alexander Fleming’s discovery of the enzyme lysozyme. In some cases such as the above, it is worth investigating the unexpected, but in other cases we should be cautious about investigating something that might lead down a dead end. Exploring the unanticipated can also be seen in research that lends a new perspective on an existing field, such as applying the notion of the tragedy of the commons to information and communications technology [7]. Much fruitful research falling under the rubric of exploring the unanticipated also consists in applying existing tools, techniques, or methods to problems that have already enjoyed much attention. An example is the application of graph theory to analyzing social, technological, biological, and other types of networks [4].

Fourth, an existing dataset could be explored in a novel way. Among other things, this includes interpreting the dataset differently from how it has been interpreted in the existing research literature, or applying new techniques to process the dataset. The task of collating a new dataset itself is an original research project often involving a team of researchers. At every step of the project, we need to adhere to established data collection protocols. Once the dataset is finally assembled, there are protocols specifying how to cleanse the dataset, how to code the data, etc. Examples of original research one of whose primary goals was the collation of a new dataset include the human genome project, the Facebook dataset project [10], the C. elegans nervous system project [12], and the modENCODE Project [6].


[1] J. Bardi. The Calculus War: Newton, Leibniz and the Greatest Mathematical Clash of All Time. High Stakes Publishing, 2007.

[2] P. Cryer. The Research Student’s Guide to Success. Open University Press, 3rd edition, 2006.

[3] W. Diffie and M. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, 22(6):644–654, 1976.

[4] D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning about a Highly Connected World. Cambridge University Press, 2010.

[5] L. Euler. Solutio problematis ad geometriam situs pertinentis. Comment Acad. Sci. Imperialis Petropolit, 8:128–140, 1736.

[6] M. B. Gerstein, Z. J. Lu, E. L. V. Nostrand, C. Cheng, B. I. Arshinoff, T. Liu, K. Y. Yip, R. Robilotto, A. Rechtsteiner, K. Ikegami, P. Alves, A. Chateigner, M. Perry, M. Morris, R. K. Auerbach, X. Feng, J. Leng, A. Vielle, W. Niu, K. Rhrissorrakrai, A. Agarwal, R. P. Alexander, G. Barber, C. M. Brdlik, J. Brennan, J. J. Brouillet, A. Carr, M.-S. Cheung, H. Clawson, S. Contrino, L. O. Dannenberg, A. F. Dernburg, A. Desai, L. Dick, A. C. Dose, J. Du, T. Egelhofer, S. Ercan, G. Euskirchen, B. Ewing, E. A. Feingold, R. Gassmann, P. J. Good, P. Green, F. Gullier, M. Gutwein, M. S. Guyer, L. Habegger, T. Han, J. G. Henikoff, S. R. Henz, A. Hinrichs, H. Holster, T. Hyman, A. L. Iniguez, J. Janette, M. Jensen, M. Kato, W. J. Kent, E. Kephart, V. Khivansara, E. Khurana, J. K. Kim, P. Kolasinska-Zwierz, E. C. Lai, I. Latorre, A. Leahey, S. Lewis, P. Lloyd, L. Lochovsky, R. F. Lowdon, Y. Lubling, R. Lyne, M. MacCoss, S. D. Mackowiak, M. Mangone, S. McKay, D. Mecenas, G. Merrihew, D. M. Miller III, A. Muroyama, J. I. Murray, S.-L. Ooi, H. Pham, T. Phippen, E. A. Preston, N. Rajewsky, G. Ratsch, H. Rosenbaum, J. Rozowsky, K. Rutherford, P. Ruzanov, M. Sarov, R. Sasidharan, A. Sboner, P. Scheid, E. Segal, H. Shin, C. Shou, F. J. Slack, C. Slightam, R. Smith, W. C. Spencer, E. O. Stinson, S. Taing, T. Takasaki, D. Vafeados, K. Voronina, G. Wang, N. L. Washington, C. M. Whittle, B. Wu, K.-K. Yan, G. Zeller, Z. Zha, M. Zhong, X. Zhou, modENCODE Consortium, J. Ahringer, S. Strome, K. C. Gunsalus, G. Micklem, X. S. Liu, V. Reinke, S. K. Kim, L. W. Hillier, S. Henikoff, F. Piano, M. Snyder, L. Stein, J. D. Lieb, and R. H. Waterston. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE Project. Science, 330(6012):1775–1787, 2010.

[7] G. M. Greco and L. Floridi. The tragedy of the digital commons. Ethics and Information Technology, 6(2):73–81, 2004.

[8] A. Haar. Zur theorie der orthogonalen funktionensysteme. Mathematische Annalen, 71(1):38–53, 1911.

[9] G. Hardin. The tragedy of the commons. Science, 162(3859):1243–1248, 1968.

[10] K. Lewis, J. Kaufman, M. Gonzalez, A. Wimmer, and N. Christakis. Tastes, ties, and time: A new social network dataset using Facebook.com. Social Networks, 30(4):330–342, 2008.

[11] A. D. Sertillanges. The Intellectual Life: Its Spirit, Conditions, Methods. Mercier Press, 1978. Translated by Mary Ryan.

[12] J. G. White, E. Southgate, J. N. Thompson, and S. Brenner. The structure of the nervous system of the nematode Caenorhabditis elegans. Philosophical Transactions of The Royal Society B, 314(1165):1–340, 1986.


Research projects

9 July 2011

Following on from a previous post that explored the notion of “academic” in academic projects, I now turn to the keyword “project”. In the Meliorist Model, a project is a set of actions leading from a current state to a desired goal. Within the context of academic projects, the existing situation is the current state of knowledge in a discipline. The desired goal is to extend the current state of knowledge in the discipline. The set of actions encompasses everything in the research process, or at least anything that directly relates to our goal. It’s almost a certainty that we would need to conduct a literature review. How else would we bring ourselves up-to-date with the state of the art in our chosen area? Like death and taxes, we can’t escape the project proposal. Sooner or later, we need to draft a project proposal outlining:

  • our aims and research questions;
  • previous work relating to what we are proposing to research (this is where a thorough literature review comes in handy);
  • the significance of the research and how it addresses an important problem;
  • how we are to tackle the problem, e.g. via a conceptual framework, theoretical methodology, or experimental approach;
  • any practical applications arising from results of our research;
  • and a project plan with milestones and anticipated timelines.

Then there is a myriad of other activities having direct bearing on our progress, such as productive procrastination by reading PhD comics. Well, maybe not the last one, but it does help to relax after a long day spent working on our project.

A research project, indeed any project, is a sequence of considered activities taking place within a specified time frame and whose purpose is to make an original contribution to scholarship. I say “sequence of considered activities” because most important activities in the project should not be undertaken in an ad hoc manner, but rather as one planned activity following another. In other words, it is important to set milestones and plan our research activities accordingly. We often don’t have a say in the setting of hard deadlines, such as the date on which our candidature confirmation is to take place, the date on which to submit a draft of our thesis for examination, and the date for submitting the final draft of the thesis. However, for anything else wherein we do have direct influence, it is crucial to set goals and work toward achieving those goals.

Finally, a research project consumes resources such as time and money. Time usually means the amount of time we invest in the project each week, but it also means the time for regular scheduled meetings with our project supervisor(s), and so on. Money refers to any financial support we receive while working on our project. Financial support can range from money to buy equipment and software to a living stipend. Let’s face it: we have to eat, and money for food must come from somewhere. For one reason or another, many research students are not eligible to receive living stipends or scholarships. A common consequence is that they need to seek part-time or casual work to get some money for food, accommodation, transportation, and other expenses. If you know how and where to get free (and nutritious) food, I’m all ears. Resources such as time and money are finite, and should be used wisely in the case of time and sparingly as regards money.

It’s academic

3 July 2011

This post is a follow-up to a previous post entitled “Undergraduate projects”. Here, I want to explore the question of what renders a computing project academic. Many degrees at the undergraduate and postgraduate levels require a research component as part or all of the degrees. Examples include Honours, Master’s, MPhil, and PhD. My primary focus here is on the PhD. As a vehicle for understanding academic projects at the research degree level, we need to wrap our heads around the twin concepts of “academic” and “project” in the context of research degrees. To this end, I will discuss these two concepts in turn. The concept of “academic” will be covered in this post, while “project” will be dealt with in a future post. Along the way, I want to tease out how these concepts relate to requirements common across all PhD degrees.

First, let’s explore what is meant by the word “academic”. The word is commonly used to designate academic staff members, meaning the lecturers and professors of a university or research institute. However, in the context of a research degree, “academic” is used as an adjective to describe the quality of a piece of work rather than as a noun designating a hired staff member. Among the hallmarks of an academic project at the postgraduate level are an opportunity to showcase our ability to formulate a problem, our deep or intimate understanding of the problem, our skills in researching a solution to the problem, and our final presentation of the problem and any solution we propose. Overall, an academic project isn’t an academic project unless it taxes our critical thinking and research skills.

At the postgraduate level, academic projects are more commonly known as research projects. In university systems within the Commonwealth and Europe, research projects represent all of a postgraduate degree such as an MPhil or a PhD. Some authors call this the “classical model of PhD” [2, p.5]. There is yet another model of PhD requiring a research component: the “taught PhD model” [2, p.5] as practiced in the USA, where a PhD candidate must first undergo coursework assessed by a general examination and then, in a second stage, carries out their research project. (We also have Professional Doctorates and Honorary Doctorates and doctors of the medical variety, but I won’t discuss them here.)

Any PhD model with a substantial research component usually has a number of elements in common. Through the medium of a research project, we as PhD candidates have an opportunity to bring skills and knowledge gained during our undergraduate studies to bear on a (hopefully challenging) long-term and sustained project, normally lasting about three years. We are not expected to know everything prior to commencing our research project; that’s practically impossible, not to mention absurd. On the contrary, throughout the research project we are afforded opportunities to develop new technical, personal, and communication skills. Being PhD candidates also means that our research projects are primarily our own responsibility. In the end, we are the ones who work on our projects and contribute original ideas. But most importantly, the outcome of a PhD research project is an original contribution to scholarship, often in the form of publications in research literature.

During our candidature, we are expected to demonstrate a number of qualities including (this list is taken from [1, p.10] with minor amendments):

  • an ability to work independently with minimum supervision;
  • an ability to draw on existing knowledge and identify additional knowledge needed for our study;
  • an ability to critically evaluate advanced literature (journal papers);
  • “an ability to conceive original ideas”;
  • an ability to plan our work effectively;
  • an ability to select and use appropriate hardware, software, tools, methods, and techniques;
  • an ability to present our work effectively in written and oral forms;
  • an ability to critically evaluate our own work and justify all aspects of it;
  • an ability to identify areas of further research in our chosen area.

Research degrees, especially the PhD, particularly emphasize our ability to generate original ideas. A PhD candidate is expected to make a substantial contribution to scholarship, not merely to rehash what others have done.


[1] C. W. Dawson. Projects in Computing and Information Systems: A Student’s Guide. Pearson Education Limited, 2nd edition, 2009.

[2] P. Dunleavy. Authoring a PhD: How to plan, draft, write and finish a doctoral thesis or dissertation. Palgrave MacMillan, 2003.

Undergraduate projects

26 June 2011

In this post, I set the stage for a discussion of the question: What is an academic project in the field of computing? This question will not be dealt with in this post, but possibly in a future post. For an understanding of what constitutes an academic computing project, we should think back to the various projects we worked on during our undergraduate years. As part of its assessment regime, a study unit might require that we work on a project for a few weeks. Such projects are usually small and their primary objective is to test our understanding of a subject matter. A project might be largely about reading up on a particular topic and writing a small piece of software in a chosen programming language.

For example, in my third year undergraduate cryptography unit, the class was given an individual project (translated as “work on your own, no team work”) to implement a few cryptosystems using Maple. Each student was given the choice to implement three cryptosystems from among a pool of cryptosystems. I can’t remember what I did exactly, but I vaguely recall that I implemented the Hill cipher. This project tested our skills in translating mathematical ideas to computer programs, our ability to document the software we wrote, and our familiarity with other good software engineering principles. Despite the unit being a semester’s worth of third year undergraduate cryptography, that mini project afforded students an opportunity to showcase not only good software engineering practices but also their ability to work individually.
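To give a flavour of what translating a mathematical idea into a computer program looks like, here is a minimal sketch of a Hill cipher over the 26-letter alphabet. It is written in Python rather than Maple, it is not the code I wrote for that unit, and the 2×2 key matrix and the function names are my own illustrative choices. Plaintext is split into pairs of letters, each pair is treated as a vector modulo 26 and multiplied by the key matrix; decryption multiplies by the inverse matrix modulo 26.

```python
from math import gcd

M = 26                    # size of the alphabet A..Z
KEY = ((3, 3), (2, 5))    # illustrative 2x2 key; its determinant 9 is coprime to 26


def inverse_key(key):
    """Return the inverse of a 2x2 key matrix modulo 26."""
    (a, b), (c, d) = key
    det = (a * d - b * c) % M
    if gcd(det, M) != 1:
        raise ValueError("key matrix is not invertible modulo 26")
    det_inv = pow(det, -1, M)  # modular inverse, requires Python 3.8+
    return (
        (d * det_inv % M, -b * det_inv % M),
        (-c * det_inv % M, a * det_inv % M),
    )


def _apply(matrix, text):
    """Multiply each pair of letters in text by the matrix, modulo 26."""
    nums = [ord(ch) - ord("A") for ch in text]
    out = []
    for i in range(0, len(nums), 2):
        x, y = nums[i], nums[i + 1]
        out.append((matrix[0][0] * x + matrix[0][1] * y) % M)
        out.append((matrix[1][0] * x + matrix[1][1] * y) % M)
    return "".join(chr(n + ord("A")) for n in out)


def encrypt(plaintext, key=KEY):
    """Encrypt the letters A..Z of plaintext; pad with X to an even length."""
    text = "".join(ch for ch in plaintext.upper() if "A" <= ch <= "Z")
    if len(text) % 2:
        text += "X"
    return _apply(key, text)


def decrypt(ciphertext, key=KEY):
    """Decrypt an even-length ciphertext consisting of the letters A..Z."""
    return _apply(inverse_key(key), ciphertext)


print(encrypt("HELP"))   # HIAT
print(decrypt("HIAT"))   # HELP
```

The sketch only handles 2×2 keys. A fuller implementation would accept an n×n key and compute the matrix inverse modulo 26 in general, which is where most of the mathematical work lies.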

Some undergraduate projects require team work. I remember a third year unit that spanned two semesters, called “Industry Project”. At the start of semester one, students allocated themselves into teams of four and the teams more or less stayed the same throughout the two semesters. Each team then chose a project with an industry partner and, as you would have guessed by now, the team would work with the industry partner throughout the two semesters to design and implement something (e.g. a software system) for the partner. The primary objectives of this year-long project were to develop our ability to work as a team, our inter-personal communication skills, and our skills in liaising with stakeholders in the project. Our final marks took into account the individual work each student contributed to their project and also the finished product the teams delivered at the end of the second semester.

Do any of the projects I mentioned above qualify as an academic project? The short answer is “No”. The cryptography project above is more or less a run-of-the-mill project: read up on an existing concept and implement it as a computer program. The existing concept would very likely have had its details worked out in the cryptography or mathematics literature, e.g. the papers [1,2] in the case of the Hill cipher. The “Industry Project” is no more academic than many projects that software engineers or programmers are hired to work on. Projects at the undergraduate level usually test our understanding of a subject matter. These allow us to showcase our ability to bring skills gained during previous years to bear on practical problems.


[1] L. S. Hill. Cryptography in an algebraic alphabet. The American Mathematical Monthly, 36(6):306–312, 1929.

[2] L. S. Hill. Concerning certain linear transformation apparatus of cryptography. The American Mathematical Monthly, 38(3):135–154, 1931.

Bubbles and gullibility

18 June 2011

Odlyzko [1] proposed a measure of gullibility, called the gullibility index, as a quantitative tool for developing realistic economic models. He argued that gullibility and innumeracy are strongly correlated, where innumeracy is understood to mean “the inability to reason with numbers and other mathematical concepts”. For example, answer the following question. What weighs more: a tonne of bricks or a tonne of feathers? Answer: both are of the same weight since each is a tonne. That we are comparing bricks with feathers is irrelevant. Innumeracy is almost universal in the sense that even people with higher degrees such as PhDs and MBAs do show signs of innumeracy. Furthermore, people who exhibit high degrees of numeracy could also fall prey to suspiciously false quantitative stories. A case in point is John Allen Paulos who related in [2] his falling victim to the Internet bubble of the early 2000s.


[1] A. Odlyzko. Bubbles, gullibility, and other challenges for economics, psychology, sociology, and information sciences. First Monday, 15(9), 2010.

[2] J. A. Paulos. A Mathematician Plays the Stock Market. Basic Books, 2003.

Version 0.7 of book “Algorithmic Graph Theory” released

24 February 2011

Here is version 0.7 of the book Algorithmic Graph Theory.

Version 0.7 fleshes out the chapter “Random Graphs”. Here is the content of the chapter in brief:

  1. Network statistics
  2. Binomial random graph model
  3. Erdős–Rényi model
  4. Small-world networks
  5. Scale-free networks
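For readers unfamiliar with the binomial random graph model, here is a minimal sketch in plain Python. It is not taken from the book, and the function name and parameters are hypothetical. I assume the usual definition: start with n vertices and include each of the n(n−1)/2 possible edges independently with probability p.

```python
import random
from itertools import combinations


def binomial_random_graph(n, p, seed=None):
    """Return the edge list of a random graph on vertices 0..n-1.

    Each possible edge {u, v} is included independently with probability p.
    """
    rng = random.Random(seed)  # a seed makes the sample reproducible
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


edges = binomial_random_graph(10, 0.3, seed=42)
print(len(edges), "edges out of a possible", 10 * 9 // 2)
```

The loop visits every pair of vertices, so it takes time quadratic in n. That is fine for small experiments but wasteful for large sparse graphs.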

Version 0.6 of book “Algorithmic Graph Theory” released

6 January 2011

Happy new year, folks! As a new year’s gift to you, here is version 0.6 of the book Algorithmic Graph Theory.

Version 0.6 adds the new chapter “Tree Data Structures” that discusses priority queues and various efficient implementations of priority queues, including binary heaps and binomial heaps. Here is the content of the new chapter in brief:

  1. Priority queues
  2. Binary heaps
  3. Binomial heaps
  4. Binary search trees
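As a taste of the first two topics, here is a minimal sketch of a priority queue backed by a binary heap. It is not code from the book: it leans on Python’s standard heapq module, which maintains a binary min-heap inside an ordinary list, and the class and method names are my own. I assume that a smaller priority value means more urgent.

```python
import heapq


class PriorityQueue:
    """A priority queue backed by a binary min-heap (via heapq)."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker so equal priorities pop in insertion order

    def push(self, item, priority):
        """Insert item with the given priority in O(log n) time."""
        heapq.heappush(self._heap, (priority, self._count, item))
        self._count += 1

    def pop(self):
        """Remove and return the item with the smallest priority in O(log n) time."""
        _priority, _count, item = heapq.heappop(self._heap)
        return item

    def __len__(self):
        return len(self._heap)


queue = PriorityQueue()
queue.push("write thesis", priority=3)
queue.push("literature review", priority=1)
queue.push("project proposal", priority=2)
while queue:
    print(queue.pop())
```

The insertion counter is a small design choice: without it, two entries with the same priority would be compared by their items, which fails for items that cannot be ordered.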