Document Type

Journal Article

Department/Unit

Department of Mathematics

Title

MultiFacTV: Module detection from higher-order time series biological data

Language

English

Abstract

BACKGROUND: Identifying modules from time series biological data helps us understand the biological functions of a group of interacting proteins or genes and how their responses change dynamically over time. The rapid acquisition of time series biological data from different laboratories and databases poses new challenges for this identification task, and powerful methods that can detect modules through integrative analysis are urgently needed. To enable such integrative analysis, we assemble multiple time series biological datasets into a higher-order form, e.g., a gene × condition × time tensor. It is interesting and useful to develop methods that identify modules from this tensor.

RESULTS: In this paper, we present MultiFacTV, a new method for finding modules in higher-order time series biological data. The method employs a tensor factorization objective function that incorporates a time-related total variation regularization term. Based on the factorization results, MultiFacTV extracts modules composed of subsets of genes, conditions and time-points. We ran MultiFacTV on synthetic datasets, and the results show that it outperforms the existing methods EDISA and Metafac. Moreover, we applied MultiFacTV to an Arabidopsis thaliana root (shoot) tissue dataset represented as a gene × condition × time tensor of size 2395 × 9 × 6 (3454 × 8 × 6), and to a Yeast dataset and a Homo sapiens dataset represented as tensors of sizes 4425 × 6 × 6 and 2920 × 14 × 9, respectively. The results show that MultiFacTV identifies interesting modules in these datasets, which have been validated and explained by Gene Ontology analysis with DAVID or by other analyses.

CONCLUSION: Experimental results on both synthetic and real datasets show that the proposed MultiFacTV is effective in identifying modules in higher-order time series biological data. Compared with traditional non-integrative analysis methods, it provides a more comprehensive view of biological processes, since modules composed of more than two types of biological variables can be identified and analyzed.
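
The abstract describes a tensor factorization objective with a total variation penalty along the time mode but does not state the formula. The sketch below shows one plausible shape of such an objective, assuming a rank-R CP-style factorization with squared reconstruction error and an L1 penalty on differences between consecutive time-points. The function names, the weight `lam`, and the exact form of the loss are illustrative assumptions, not the formulation from the paper.

```python
import numpy as np

def cp_reconstruct(G, C, T):
    """Rebuild a gene x condition x time tensor from rank-R factor matrices."""
    # G: (n_genes, R), C: (n_conditions, R), T: (n_times, R)
    return np.einsum("ir,jr,kr->ijk", G, C, T)

def total_variation(T):
    """Sum of absolute differences between consecutive time-points, over all components."""
    return np.abs(np.diff(T, axis=0)).sum()

def tv_objective(X, G, C, T, lam=0.1):
    """Assumed objective: squared reconstruction error plus a TV penalty on the time factor."""
    residual = X - cp_reconstruct(G, C, T)
    return 0.5 * np.sum(residual ** 2) + lam * total_variation(T)

# Synthetic example with small, hypothetical dimensions.
rng = np.random.default_rng(0)
n_genes, n_conds, n_times, rank = 20, 4, 6, 2
G = rng.random((n_genes, rank))
C = rng.random((n_conds, rank))
# Slowly varying time profiles: cumulative sums of small non-negative steps.
T_smooth = np.cumsum(rng.random((n_times, rank)) * 0.1, axis=0)
X = cp_reconstruct(G, C, T_smooth)

# Perturbing the time factor adds reconstruction error and changes the TV term.
T_noisy = T_smooth + rng.normal(scale=0.5, size=T_smooth.shape)
obj_smooth = tv_objective(X, G, C, T_smooth)
obj_noisy = tv_objective(X, G, C, T_noisy)
```

With the exact factors, the reconstruction error vanishes and only the penalty term remains, so `obj_smooth` equals `lam` times the total variation of `T_smooth`. A penalty of this kind favours time profiles that change gradually between neighbouring time-points, which is the usual motivation for total variation regularization on time series.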

Publication Date

2013

Source Publication Title

BMC Genomics

Volume

14(Suppl)

Issue

S2

Start Page

1

End Page

11

Publisher

BioMed Central

DOI

10.1186/1471-2164-14-S4-S2

Link to Publisher's Edition

http://dx.doi.org/10.1186/1471-2164-14-S4-S2

ISSN (print)

1471-2164

ISSN (electronic)

1471-2164
