Sahar Al Seesi

Computer Science Department

Smith College

Office: BASS 106
Email: salseesi at smith dot edu

Home | Teaching | Students | Publications | Software | CV

I am a Visiting Assistant Professor in the Computer Science Department at Smith College. My research is in bioinformatics, where I develop algorithms for analyzing next-generation sequencing data. My work addresses several interrelated problems, including RNA transcriptome reconstruction, allele-specific isoform expression estimation, differential gene expression analysis, cancer-specific variant identification, and neo-epitope prediction. Tackling each of these problems requires a combination of computational approaches such as machine learning, grammatical modeling, statistical analysis, dynamic programming, and expectation-maximization algorithms.

Before joining Smith College, I worked as an Assistant Professor in Residence in the Computer Science Department at the University of Connecticut.

I live with my husband and three sons in Bloomfield, Connecticut. I immensely enjoy the changing scenery across the seasons during my daily commute to Smith College.




I am always happy to hear from motivated students looking to get involved in research. If you are interested, please feel free to contact me.




IsoEM2 infers isoform and gene expression levels from high-throughput transcriptome sequencing (RNA-Seq) data. IsoEM2 uses an Expectation-Maximization (EM) algorithm based on a probabilistic model that takes into account the fragment length distribution, whose mean and standard deviation are either specified by the user or inferred automatically when paired-end reads are used. The current version generates bootstrap-based confidence intervals for the TPM/FPKM estimates and is distributed along with the IsoDE2 package for bootstrap-based differential expression analysis.
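To give a feel for the EM idea, here is a minimal toy sketch in Python. It is not the IsoEM2 implementation: the function name `em_isoform_abundance`, the input layout (one dictionary of isoform weights per read), and the stopping rule are all assumptions made for illustration. The weights stand in for whatever the probabilistic model assigns to a read-isoform pair, for example the probability of the implied fragment length.

```python
def em_isoform_abundance(compat, n_iters=200, tol=1e-9):
    """Toy EM estimate of the fraction of reads originating from each isoform.

    compat: list with one dict per read, mapping a (hypothetical) isoform
            name to a non-negative compatibility weight for that read.
    Returns a dict mapping isoform name -> estimated read fraction.
    """
    isoforms = sorted({iso for read in compat for iso in read})
    # Start from a uniform guess.
    theta = {iso: 1.0 / len(isoforms) for iso in isoforms}

    for _ in range(n_iters):
        # E-step: split each read fractionally among compatible isoforms.
        counts = dict.fromkeys(isoforms, 0.0)
        for read in compat:
            total = sum(theta[iso] * w for iso, w in read.items())
            if total == 0.0:
                continue  # read is incompatible with every isoform
            for iso, w in read.items():
                counts[iso] += theta[iso] * w / total

        # M-step: re-estimate fractions from the expected counts.
        n = sum(counts.values())
        if n == 0.0:
            raise ValueError("no read is compatible with any isoform")
        new_theta = {iso: counts[iso] / n for iso in isoforms}

        # Stop once the estimates no longer move.
        delta = max(abs(new_theta[iso] - theta[iso]) for iso in isoforms)
        theta = new_theta
        if delta < tol:
            break
    return theta


# Six reads map only to isoform A, two only to B, two are ambiguous.
reads = [{"A": 1.0}] * 6 + [{"B": 1.0}] * 2 + [{"A": 1.0, "B": 1.0}] * 2
estimates = em_isoform_abundance(reads)
```

The sketch returns read fractions only. Converting them to TPM or FPKM would additionally require normalizing by isoform length, which is left out here.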


IsoDE2 (distributed as part of the IsoEM2 package) performs differential gene and isoform expression analysis for RNA-Seq data both with and without replicates. IsoDE2 is based on bootstrapping, which compensates for the lack of replicates. It relies on IsoEM2, an accurate expectation-maximization algorithm for gene/isoform expression level estimation that performs fast in-memory bootstrapping.


GeNeo is a suite of tools for predicting neo-epitopes from matched normal and tumor human exome sequencing data, coupled with tumor transcriptome sequencing to identify the epitopes expressed in the tumor. One of the main tools in this suite is a somatic variant calling pipeline that makes use of data from multiple sequencing platforms, multiple somatic callers, and SNV validation steps based on targeted single-cell sequencing. Epitope calling is done through a tool that uses NetMHC 4.0. The tools are accessible through easy-to-use graphical user interfaces available on the Galaxy platform.


Epi-Seq is a multi-step bioinformatics analysis pipeline that starts from raw RNA-Seq tumor reads and produces a set of predicted tumor-specific expressed epitopes. It integrates several bioinformatics tools, including SNVQ for calling SNVs from RNA-Seq data and NetMHC 3.0 for epitope prediction.
