by Reuben Shamir
Working in close collaboration with experienced and highly skilled software engineers, I’ve learned valuable new tools and approaches that are relevant to any lab in the fields of medical image processing and computer-aided surgery. Here I take the opportunity to share what I learned.
1) Python is great for computer-aided surgery R&D
Python is a widely used high-level programming language. More importantly, it has almost everything you are likely to need for research and development in the fields of medical image processing and computer-aided surgery. I found the language easy to learn, and there are many references and helpful examples on the web. In the past year, all research and development at our company was done with Python. This allowed us to move quickly from research to product development, to monitor the software from start to end, and to communicate well with the product engineers, who know the language well. Here are some relevant Python packages:
SciPy (https://www.scipy.org/): SciPy is a Python-based open-source collection of packages for mathematics, science, and engineering. The packages I find most useful are: NumPy, the fundamental package for scientific computing with Python; the SciPy library, which provides many user-friendly and efficient numerical routines, for example for numerical integration and optimization; matplotlib, a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms; and pandas, which provides high-performance, easy-to-use data structures and data analysis tools.
SimpleITK (http://www.simpleitk.org/): ITK (Insight Segmentation and Registration Toolkit) is an open-source, cross-platform system that provides developers with an extensive suite of software tools for image analysis. SimpleITK is a simplified layer built on top of ITK, intended to facilitate its use in rapid prototyping, education, and interpreted languages such as Python.
VTK (http://www.vtk.org/): The Visualization Toolkit (VTK) is an open-source, freely available software system for 3D computer graphics, modeling, image processing, volume rendering, scientific visualization, and information visualization. It can be used from Python.
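To give a feel for how compact such code can be, here is a minimal sketch of point-based rigid registration, a common building block in computer-aided surgery, written with NumPy only. It uses the standard SVD-based least-squares solution; the function names `rigid_register` and `fiducial_registration_error` are my own choices for illustration and are not part of any of the packages above. It assumes the two point sets are already in corresponding order and are not degenerate (for example, not all on one line).

```python
import numpy as np


def rigid_register(moving: np.ndarray, fixed: np.ndarray):
    """Least-squares rigid transform (R, t) mapping `moving` onto `fixed`.

    Both arrays have shape (N, 3), with rows in corresponding order.
    """
    moving_centroid = moving.mean(axis=0)
    fixed_centroid = fixed.mean(axis=0)

    # 3x3 cross-covariance of the centred point sets.
    H = (moving - moving_centroid).T @ (fixed - fixed_centroid)
    U, _, Vt = np.linalg.svd(H)

    # Force a proper rotation (determinant +1) rather than a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = fixed_centroid - R @ moving_centroid
    return R, t


def fiducial_registration_error(moving, fixed, R, t) -> float:
    """Root-mean-square distance between the mapped moving points and the fixed points."""
    residuals = (moving @ R.T + t) - fixed
    return float(np.sqrt((residuals ** 2).sum(axis=1).mean()))
```

With four or more well-spread fiducials, calling `rigid_register(moving, fixed)` and then `fiducial_registration_error(...)` gives the transform and a simple quality measure in a few lines.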
There are many Python integrated development environments (IDEs; see https://wiki.python.org/moin/IntegratedDevelopmentEnvironments). Personally, I find PyCharm (https://www.jetbrains.com/pycharm/) especially convenient and comprehensive. It also integrates well with the Git version control software (see below) and assists with code merging and conflict resolution.
2) Cloud computing services
Cloud computing is very common in industry today. Within a few minutes, it is possible to set up a cluster and run dozens of computationally intensive tasks. It allowed us to cut the computation time by half and to scale up quickly when needed. Cloud computing can also be useful for research labs, for example when a deadline is approaching or under other time constraints. The costs are affordable (10-20 USD/day for a server with 16 cores, 60 GB RAM, and a 700 GB SSD), and many providers charge only while the machines are running.
3) Agile management
Agile project management has transformed the way software engineers develop products today. Generally speaking, agile software development cultivates healthy habits such as open and continuous communication, readiness to change, and working on short-term tasks while constantly evaluating overall progress and bringing up suggestions for improvement. The scope of agile management extends beyond the specific field of software development and can improve research as well.
Here are a few pointers: 1) A good starting point on this subject is the “agile manifesto” and the “twelve principles of agile software” (http://agilemanifesto.org/); 2) Scrum. Scrum is a widely used framework for agile management of software development (https://www.scrumalliance.org/); 3) SCORE. SCORE is an adaptation of Scrum and the agile approach for managing research (http://www.cs.umd.edu/~mwh/papers/score.pdf); and 4) An article in Harvard Business Review on extending the agile approach to the whole organization (https://hbr.org/2014/11/bring-agile-to-the-whole-organization).
4) Code version control
Wow, how did I work without it!? Version control software facilitates backup and tracking of code changes. It also enables collaboration and teamwork. For example, let’s say you have developed a new segmentation method. After some time, you have an idea to make it faster. No problem: make a new branch and implement your idea. If it does not work well, just delete or ignore this branch. If it does work well, merge it into the main code. It is still possible to go back and run the slower version if needed. Three years later, undergraduate students need this code for their project? Two students work on different aspects of this module? Contemporary version control software offers solutions for these and similar situations.
Git is probably a good starting point (https://git-scm.com/). Graphical user interfaces can simplify monitoring and managing the software. Git lists some GUI options at https://git-scm.com/downloads/guis.
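To make the branch-and-merge example above concrete, here is a small sketch that drives the Git command line from Python. The helper names (`git`, `start_experiment`, `keep_experiment`, `discard_experiment`) are hypothetical and chosen only for this illustration; in everyday work you would more likely type the underlying `git checkout -b`, `git merge`, and `git branch -D` commands directly or use a GUI. The sketch assumes Git is installed and that `repo` is an existing repository with at least one commit.

```python
import subprocess
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a git command inside `repo` and return its standard output."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def start_experiment(repo: Path, branch: str) -> None:
    """Create a new branch for trying out an idea and switch to it."""
    git(repo, "checkout", "-b", branch)


def keep_experiment(repo: Path, branch: str, main_branch: str) -> None:
    """The idea worked: merge the experimental branch into the main code."""
    git(repo, "checkout", main_branch)
    git(repo, "merge", "--no-edit", branch)


def discard_experiment(repo: Path, branch: str, main_branch: str) -> None:
    """The idea did not work: go back to the main code and delete the branch."""
    git(repo, "checkout", main_branch)
    git(repo, "branch", "-D", branch)
```

Because every commit stays in the history, the slower version remains available after a merge and can be checked out again by its commit identifier.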
5) Continuous documentation of research
Research notebooks are often a good idea, as published manuscripts cover only part of the information needed to run a successful experiment. Data preparation, post-computation steps, exact parameter values, and the development environment are just a few of the factors that may have a major effect on the results but are too often not well documented. Moreover, failed experiments can be documented to help better understand the limits of the method and improve it later on. In addition, the environment setup can be documented to ensure repeatability of results. This can help new lab members who continue an old project get up and running in a short time.
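As a small illustration of this kind of record keeping, the sketch below writes the parameters of one experiment, together with basic information about the environment, to a JSON file that can sit next to the results. The function name `write_experiment_record`, the field names, and the example parameters are assumptions made for this example, not a standard; it uses only the Python standard library.

```python
import json
import platform
import sys
from datetime import datetime, timezone
from pathlib import Path


def write_experiment_record(path, parameters: dict, notes: str = "") -> dict:
    """Save the exact parameters and basic environment details of one run."""
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version,
        "platform": platform.platform(),
        "parameters": parameters,  # exact values used in this run
        "notes": notes,            # e.g. why the run failed or what changed
    }
    Path(path).write_text(
        json.dumps(record, indent=2, sort_keys=True), encoding="utf-8"
    )
    return record
```

For example, `write_experiment_record("run_001.json", {"sigma": 1.5, "iterations": 200}, notes="baseline")` leaves a small file that documents the run, including failed ones.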
Here are some pointers: 1) sciNote. sciNote is an electronic lab notebook that helps organize scientific data and stores it all safely in one place (http://scinote.net/); 2) Confluence. In our company we use development-oriented collaboration software named Confluence (by Atlassian; https://www.atlassian.com/software/confluence). Confluence is a service in which team members can discuss work, record decisions, comment on documents, and otherwise collaborate as a team. When new team members come on board, Confluence gives them context and history about both the projects at hand and the team itself; 3) Jupyter Notebook. Another related tool that I find useful in this category is the Jupyter Notebook (http://jupyter.org/). Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text.
Lastly, a nice feature of Python is that it is possible to save a description of the entire environment in which the experiment was conducted and install it all on another computer or in a virtual environment if needed. This helps ensure that the supporting packages are exactly the same versions as those used for the original experiment.
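A common way to do this is `pip freeze > requirements.txt` on the original machine, followed by `pip install -r requirements.txt` on the new one. The sketch below does something similar from inside Python using the standard library's `importlib.metadata` (Python 3.8 or later). The function names and the output file name are my own choices for illustration, and the result is only an approximation of what `pip freeze` produces (for example, it does not treat packages installed from a local path or a Git URL specially).

```python
from importlib import metadata
from pathlib import Path


def installed_packages() -> list:
    """Return 'name==version' pins for the packages visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta["Name"] if meta is not None else None
        version = meta["Version"] if meta is not None else None
        if name and version:  # skip entries with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


def write_snapshot(path) -> list:
    """Write the pins to a requirements-style text file, one per line."""
    lines = installed_packages()
    Path(path).write_text(
        "".join(line + "\n" for line in lines), encoding="utf-8"
    )
    return lines
```

Storing such a snapshot next to each experiment record makes it possible to rebuild a matching environment later.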