no code implementations • 28 Mar 2024 • Sarwan Ali, Prakash Chourasia, Murray Patterson
This study introduces a novel approach, combining substruct counting, $k$-mers, and Daylight-like fingerprints, to expand the representation of chemical structures in SMILES strings.
no code implementations • 12 Feb 2024 • Sarwan Ali, Tamkanat E Ali, Prakash Chourasia, Murray Patterson
In this work, we present a novel compression-based model, motivated by \cite{jiang2023low}, which combines the simplicity of basic compression algorithms such as Gzip and Bz2 with the Normalized Compression Distance (NCD) to achieve better performance on classification tasks without relying on handcrafted features or pre-trained models.
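As a rough illustration of the NCD mentioned above, the following minimal sketch computes the standard formula NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The function names `compressed_len` and `ncd` are hypothetical, and the snippet does not reproduce the authors' full classification pipeline; it only shows how a pairwise distance could be obtained from Gzip or Bz2.

```python
import bz2
import gzip


def compressed_len(data: bytes, method: str = "gzip") -> int:
    """Length in bytes of `data` after compression with the chosen method."""
    if method == "gzip":
        return len(gzip.compress(data))
    if method == "bz2":
        return len(bz2.compress(data))
    raise ValueError(f"unknown compression method: {method}")


def ncd(x: str, y: str, method: str = "gzip") -> float:
    """Normalized Compression Distance between two strings.

    Uses the standard definition; concatenation order (x then y)
    is an assumption of this sketch.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx = compressed_len(bx, method)
    cy = compressed_len(by, method)
    cxy = compressed_len(bx + by, method)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    a = "ACGT" * 200
    b = "".join(str(i) for i in range(300))
    # Similar inputs should give a smaller distance than dissimilar ones.
    print(ncd(a, a), ncd(a, b))
```

A matrix of such pairwise distances could then be passed to any distance-based classifier; which classifier the paper uses is not stated in this snippet.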
no code implementations • 1 Feb 2024 • Sarwan Ali, Pablo Moscato
We present a memetic algorithm approach for finding a Hamiltonian cycle in a Hamiltonian graph.
no code implementations • 20 Sep 2023 • Sarwan Ali
By combining extrinsic evaluation methods, such as classification and clustering, with t-SNE-based neighborhood analysis, such as neighborhood agreement and trustworthiness, we provide a comprehensive assessment of the representation capacity.
no code implementations • 15 Jul 2023 • Usama Sardar, Sarwan Ali, Muhammad Sohaib Ayub, Muhammad Shoaib, Khurram Bashir, Imdad Ullah Khan, Murray Patterson
We curated a comprehensive dataset of Nanobody-Antigen binding and nonbinding data and devised an embedding method based on gapped k-mers to predict binding based only on sequences of nanobody and antigen.
1 code implementation • 14 Jul 2023 • Muhammad Sohaib Ayub, Naimat Ullah, Sarwan Ali, Imdad Ullah Khan, Mian Muhammad Awais, Muhammad Asad Khan, Safiullah Faizullah
We propose Context-Aware Metric of player Performance, CAMP, to quantify individual players' contributions toward a cricket match outcome.
no code implementations • 8 Jun 2023 • Mansoor Ahmed, Usama Sardar, Sarwan Ali, Shafiq Alam, Murray Patterson, Imdad Ullah Khan
The proposed BAE framework provides a new approach for estimating brain age, which has important implications for the understanding of neurological disorders and age-related brain changes.
1 code implementation • 25 Apr 2023 • Zahra Tayebi, Sarwan Ali, Prakash Chourasia, Taslim Murad, Murray Patterson
Sparse coding is a popular technique in machine learning that represents data with a set of informative features; it can capture complex relationships between amino acids and identify subtle patterns in the sequence that might be missed by low-dimensional methods.
no code implementations • 24 Apr 2023 • Sarwan Ali, Babatunde Bello, Prakash Chourasia, Ria Thazhe Punathil, Pin-Yu Chen, Imdad Ullah Khan, Murray Patterson
Understanding the host-specificity of different families of viruses sheds light on the origin of, e.g., SARS-CoV-2, rabies, and other such zoonotic pathogens in humans.
no code implementations • 13 Apr 2023 • Sarwan Ali, Taslim Murad, Murray Patterson
Therefore, the usage of only the spike protein, instead of the full genome, provides most of the essential information for performing analyses such as host classification.
1 code implementation • 6 Apr 2023 • Sarwan Ali, Prakash Chourasia, Zahra Tayebi, Babatunde Bello, Murray Patterson
In this work, we propose ViralVectors, a method for generating compact feature vectors from virome sequencing data that allows effective downstream analysis.
no code implementations • 1 Apr 2023 • Sarwan Ali, Usama Sardar, Murray Patterson, Imdad Ullah Khan
Kernel-based methods, e.g., SVM, are a proven, efficient, and useful alternative for several machine learning (ML) tasks such as sequence classification.
no code implementations • 4 Mar 2023 • Taslim Murad, Sarwan Ali, Murray Patterson
Machine learning (ML) technologies provide new tools for biological sequence analysis that effectively analyze the functions and structures of the sequences.
no code implementations • 17 Feb 2023 • Prakash Chourasia, Taslim Murad, Zahra Tayebi, Sarwan Ali, Imdad Ullah Khan, Murray Patterson
This paper presents a federated learning (FL) approach to train an AI model for SARS-CoV-2 variant classification.
no code implementations • 1 Feb 2023 • Sarwan Ali, Prakash Chourasia, Murray Patterson
Anderson acceleration (AA) is a well-known method for accelerating the convergence of iterative algorithms, with applications in various fields including deep learning and optimization.
no code implementations • 19 Nov 2022 • Sarwan Ali
Similarly, Euclidean space is not considered the best choice when working on classification and clustering tasks for biological sequences.
1 code implementation • 16 Nov 2022 • Prakash Chourasia, Sarwan Ali, Murray Patterson
We show that, by using different techniques such as informed initialization and kernel matrix selection, t-SNE performs significantly better.
no code implementations • 15 Nov 2022 • Prakash Chourasia, Sarwan Ali, Simone Ciccolella, Gianluca Della Vedova, Murray Patterson
As a result, new methods such as Pangolin, which can scale to the millions of samples of SARS-CoV-2 currently available, have appeared.
1 code implementation • 1 Nov 2022 • Haris Mansoor, Sarwan Ali, Shafiq Alam, Muhammad Asad Khan, Umair ul Hassan, Imdadullah Khan
In this paper, we analyze the effect on fairness in the context of graph data (node attributes) imputation using different embedding and neural network methods.
no code implementations • 11 Sep 2022 • Sarwan Ali, Bikram Sahoo, Muhammad Asad Khan, Alexander Zelikovsky, Imdad Ullah Khan, Murray Patterson
More specifically, we improve the quality of the approximate kernel using domain knowledge (computed using information gain) and efficient preprocessing (using minimizers computation) to classify coronavirus spike protein sequences corresponding to different variants (e.g., Alpha, Beta, Gamma).
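To make the minimizer preprocessing step concrete, here is a minimal sketch of one common definition: for each k-mer, keep the lexicographically smallest m-mer (m < k) it contains. The function name `minimizers`, the forward-only scan, and the lexicographic ordering are assumptions of this sketch; the paper's exact variant may differ (for instance, by also considering the reversed k-mer).

```python
from typing import List


def minimizers(seq: str, k: int, m: int) -> List[str]:
    """Return one minimizer per k-mer of `seq`.

    The minimizer of a k-mer is taken here to be its lexicographically
    smallest substring of length m (an assumption of this sketch).
    """
    if not 0 < m <= k:
        raise ValueError("require 0 < m <= k")
    result = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Enumerate all m-mers inside the current k-mer and keep the smallest.
        result.append(min(kmer[j:j + m] for j in range(k - m + 1)))
    return result


if __name__ == "__main__":
    # k-mers: ACGT, CGTA, GTAC
    print(minimizers("ACGTAC", k=4, m=2))  # ['AC', 'CG', 'AC']
```

Because consecutive k-mers often share the same minimizer, downstream steps can work with a much smaller set of distinct substrings than the full k-mer set.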
no code implementations • 27 Jul 2022 • Sarwan Ali
Since smartphones are easily available to every human being in the modern world, using them to track human activities becomes possible.
1 code implementation • 18 Jul 2022 • Sarwan Ali, Bikram Sahoo, Alexander Zelikovskiy, Pin-Yu Chen, Murray Patterson
The rapid spread of the COVID-19 pandemic has resulted in an unprecedented amount of sequence data of the SARS-CoV-2 genome, with millions of sequences and counting.
no code implementations • 6 Jan 2022 • Sarwan Ali, Babatunde Bello, Prakash Chourasia, Ria Thazhe Punathil, Yijing Zhou, Murray Patterson
In coronaviruses, the surface (S) protein, or spike protein, plays an important part in determining host specificity since it is the point of contact between the virus and the host cell membrane.
no code implementations • 18 Oct 2021 • Sarwan Ali, Yijing Zhou, Murray Patterson
Applying machine-learning-based algorithms to this big data is a natural approach toward this aim, since they can quickly scale to such data and extract the relevant information in the presence of variety and different levels of veracity.
1 code implementation • 18 Oct 2021 • Zahra Tayebi, Sarwan Ali, Murray Patterson
We then show that with the appropriate feature selection, we can efficiently and effectively cluster the spike sequences based on the different variants.
1 code implementation • 2 Oct 2021 • Sarwan Ali, Babatunde Bello, Zahra Tayebi, Murray Patterson
With the rapid spread of COVID-19 worldwide, viral genomic data is available in the order of millions of sequences on public databases such as GISAID.
no code implementations • 29 Sep 2021 • Sarwan Ali, Bikram Sahoo, Pin-Yu Chen, Murray Patterson
The rapid spread of the COVID-19 pandemic has resulted in an unprecedented amount of sequence data of the SARS-CoV-2 viral genome, with millions of sequences and counting.
no code implementations • 17 Sep 2021 • Inaam Ul Hassan, Abdul Haseeb, Sarwan Ali
An HDR image comprises multiple narrow-range-exposure images combined into one high-quality image.
1 code implementation • 12 Sep 2021 • Sarwan Ali, Murray Patterson
Through experiments, we show that Spike2Vec is not only scalable on several million spike sequences, but also outperforms the baseline models in terms of prediction accuracy, F1 score, etc.
no code implementations • 2 Sep 2021 • Zohair Raza Hassan, Sarwan Ali, Imdadullah Khan, Mudassir Shabbir, Waseem Abbas
Operating on edge streams allows us to avoid storing the entire graph in memory, and controlling the sample size enables us to keep the runtime of our algorithms within desired bounds.
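One standard way to keep a bounded sample while reading an edge stream is reservoir sampling, sketched below. The snippet does not state which sampling scheme the paper uses, so this is an illustrative stand-in rather than the authors' algorithm; the function name `reservoir_sample_edges` and the `seed` parameter are hypothetical.

```python
import random
from typing import Iterable, List, Optional, Tuple

Edge = Tuple[int, int]


def reservoir_sample_edges(
    stream: Iterable[Edge], sample_size: int, seed: Optional[int] = None
) -> List[Edge]:
    """Keep a uniform random sample of at most `sample_size` edges.

    Only the reservoir is held in memory; the stream is read once.
    """
    rng = random.Random(seed)
    reservoir: List[Edge] = []
    for i, edge in enumerate(stream):
        if i < sample_size:
            # Fill the reservoir with the first `sample_size` edges.
            reservoir.append(edge)
        else:
            # Replace a random slot with probability sample_size / (i + 1).
            j = rng.randint(0, i)
            if j < sample_size:
                reservoir[j] = edge
    return reservoir


if __name__ == "__main__":
    # A generator stands in for an edge stream that is never fully stored.
    edges = ((u, u + 1) for u in range(1_000_000))
    sample = reservoir_sample_edges(edges, sample_size=5, seed=0)
    print(len(sample))  # 5
```

Memory use depends only on `sample_size`, and any per-sample computation is bounded in the same way, which matches the trade-off described above.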
no code implementations • 18 Aug 2021 • Sarwan Ali, Tamkanat-E-Ali, Muhammad Asad Khan, Imdadullah Khan, Murray Patterson
Using a $k$-mer based feature vector generation and efficient feature selection methods, our approach is effective in identifying variants, as well as being efficient and scalable to millions of sequences.
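A $k$-mer based feature vector can be sketched as a count of every length-$k$ substring over a fixed alphabet. The function name `kmer_spectrum`, the default nucleotide alphabet, and the choice to skip $k$-mers containing unknown characters are assumptions of this sketch; for protein sequences the alphabet would be the amino acid letters, and the feature selection step is not shown.

```python
from itertools import product
from typing import List


def kmer_spectrum(seq: str, k: int = 3, alphabet: str = "ACGT") -> List[int]:
    """Count occurrences of each possible k-mer, in lexicographic
    order of the alphabet, giving a vector of length len(alphabet) ** k."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = [0] * len(index)
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip k-mers with characters outside the alphabet
            vec[pos] += 1
    return vec


if __name__ == "__main__":
    v = kmer_spectrum("ACGTAC", k=2)
    print(len(v), sum(v))  # 16 5
```

The vector length grows as len(alphabet) ** k, which is one reason a feature selection step becomes useful at larger $k$.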
no code implementations • 7 Aug 2021 • Sarwan Ali, Bikram Sahoo, Naimat Ullah, Alexander Zelikovskiy, Murray Patterson, Imdadullah Khan
With the rapid spread of the novel coronavirus (COVID-19) across the globe and its continuous mutation, it is of pivotal importance to design a system to identify different known (and unknown) variants of SARS-CoV-2.
no code implementations • 2 Feb 2020 • Sarwan Ali, Haris Mansoor, Imdadullah Khan, Naveed Arshad, Safiullah Faizullah, Muhammad Asad Khan
However, these solutions are not fair in terms of electricity distribution.
no code implementations • 2 Feb 2020 • Asad Ullah, Sarwan Ali, Imdadullah Khan, Muhammad Asad Khan, Safiullah Faizullah
In this paper, we investigate the effect of the analysis window and feature selection on classification accuracy of different hand and wrist movements using time-domain features.
no code implementations • 28 Dec 2019 • Haris Mansoor, Sarwan Ali, Imdadullah Khan, Naveed Arshad, Muhammad Asad Khan, Safiullah Faizullah
A prominent feature of FMF is that it works at any level of user-specified granularity, in both the temporal (from a single hour to days) and spatial (a single household to groups of consumers) dimensions.
no code implementations • 27 Dec 2019 • Sarwan Ali, Muhammad Ahmad, Umair ul Hassan, Muhammad Asad Khan, Shafiq Alam, Imdadullah Khan
Data analysis requires a pairwise proximity measure over objects.
no code implementations • 27 Dec 2019 • Sarwan Ali, Muhammad Haroon Shakeel, Imdadullah Khan, Safiullah Faizullah, Muhammad Asad Khan
Predicting node attributes in such graphs is an important problem with applications in many domains like recommendation systems, privacy preservation, and targeted advertisement.