Matches in Nanopublications for { ?s ?p ?o <https://w3id.org/np/RAuZdwi5v34TP1MWsdSxlqcFFrNgAsnNfPkF_bF-PjuGA/assertion>. }
- 5e018f3a-bc34-48db-b72e-f85214a13a9d author 0000-0003-2388-0744 assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 author 0000-0003-2388-0744 assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 author 0000-0003-2388-0744 assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f author 0000-0003-2388-0744 assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 author 0000-0003-2388-0744 assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 author 0000-0003-2388-0744 assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 author 0000-0003-2388-0744 assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c author 0000-0003-2388-0744 assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 author 0000-0003-2388-0744 assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea author 0000-0003-2388-0744 assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for GeneMark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ```" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for GeneMark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ``` " assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 description "# ERGA Protein-coding gene annotation workflow. Adapted from the work of Sagane Joye: https://github.com/sdind/genome_annotation_workflow ## Prerequisites The following programs are required to run the workflow and the listed versions were tested. It should be noted that older versions of snakemake are not compatible with newer versions of singularity as is noted here: [https://github.com/nextflow-io/nextflow/issues/1659](https://github.com/nextflow-io/nextflow/issues/1659). `conda v 23.7.3` `singularity v 3.7.3` `snakemake v 7.32.3` You will also need to acquire a licence key for GeneMark and place this in your home directory with name `~/.gm_key` The key file can be obtained from the following location, where the licence should be read and agreed to: http://topaz.gatech.edu/GeneMark/license_download.cgi ## Workflow The pipeline is based on braker3 and was tested on the following dataset from Drosophila melanogaster: [https://doi.org/10.5281/zenodo.8013373](https://doi.org/10.5281/zenodo.8013373) ### Input data - Reference genome in fasta format - RNAseq data in paired-end zipped fastq format - uniprot fasta sequences in zipped fasta format ### Pipeline steps - **Repeat Model and Mask** Run RepeatModeler using the genome as input, filter any repeats also annotated as protein sequences in the uniprot database and use this filtered library to mask the genome with RepeatMasker - **Map RNAseq data** Trim any remaining adapter sequences and map the trimmed reads to the input genome - **Run gene prediction software** Use the mapped RNAseq reads and the uniprot sequences to create hints for gene prediction using Braker3 on the masked genome - **Evaluate annotation** Run BUSCO to evaluate the completeness of the annotation produced ### Output data - FastQC reports for input RNAseq data before and after adapter trimming - RepeatMasker report containing quantity of masked sequence and distribution among TE families - Protein-coding gene annotation file in gff3 format - BUSCO summary of annotated sequences ## Setup Your data should be placed in the `data` folder, with the reference genome in the folder `data/ref` and the transcript data in the folder `data/rnaseq`. The config file requires the following to be given: ``` asm: 'absolute path to reference fasta' snakemake_dir_path: 'path to snakemake working directory' name: 'name for project, e.g. mHomSap1' RNA_dir: 'absolute path to rnaseq directory' busco_phylum: 'busco database to use for evaluation e.g. mammalia_odb10' ```" assertion.
- Workflow-RO-Crate version "0.2.0" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 version "1" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df contentUrl "https://api.rohub.org/api/ros/7019724e-b5a0-4f7e-a7d6-a1baacac85df/crate/download/" assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c contentUrl "https://api.rohub.org/api/resources/043ddb69-1a1c-4eb4-8785-9e3803fa377c/download/" assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 contentUrl "https://api.rohub.org/api/resources/0cdb95da-31b8-40ad-a2f9-32154e335db2/download/" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 contentUrl "https://api.rohub.org/api/resources/11f3e069-8d1d-48da-876f-52fd6d255223/download/" assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 contentUrl "https://api.rohub.org/api/resources/1dd2de01-f16f-4704-853a-116e2de3ff65/download/" assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a contentUrl "https://api.rohub.org/api/resources/21e37e3d-1372-4acd-9ce7-b1a005e9a41a/download/" assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 contentUrl "https://api.rohub.org/api/resources/2dd40f39-d265-4afe-ab50-5238e7bd6b16/download/" assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa contentUrl "https://api.rohub.org/api/resources/52497c90-bae5-4fbd-aca5-b5175ff7a4fa/download/" assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d contentUrl "https://api.rohub.org/api/resources/5e018f3a-bc34-48db-b72e-f85214a13a9d/download/" assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 contentUrl "https://api.rohub.org/api/resources/6420e76a-1e0e-42a0-9b9e-e355de2a88a5/download/" assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 contentUrl "https://api.rohub.org/api/resources/64cb8be0-e907-41a0-8f6b-dbf642fe4372/download/" assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f contentUrl "https://api.rohub.org/api/resources/69f9d60c-925c-4b44-9b87-fc35c96eed2f/download/" assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 contentUrl "https://api.rohub.org/api/resources/6aadc530-3b55-44e0-8c8d-8762952df493/download/" assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 contentUrl "https://api.rohub.org/api/resources/70cec592-1fbe-4b53-a8ef-8dd3522246d6/download/" assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 contentUrl "https://api.rohub.org/api/resources/7d430516-d8cb-4bc0-a4d9-e103a63e3478/download/" assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c contentUrl "https://api.rohub.org/api/resources/89e1f03d-e2e9-46f1-8386-8cdeff35386c/download/" assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 contentUrl "https://api.rohub.org/api/resources/b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1/download/" assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea contentUrl "https://api.rohub.org/api/resources/d6d97bc9-1631-4347-920b-dec30f6aebea/download/" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df creator 0000-0003-2388-0744 assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c creator 0000-0003-2388-0744 assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 creator 0000-0003-2388-0744 assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 creator 554 assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 creator 0000-0003-2388-0744 assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a creator 0000-0003-2388-0744 assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 creator 0000-0003-2388-0744 assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa creator 0000-0003-2388-0744 assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d creator 0000-0003-2388-0744 assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 creator 0000-0003-2388-0744 assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 creator 0000-0003-2388-0744 assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f creator 0000-0003-2388-0744 assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 creator 0000-0003-2388-0744 assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 creator 0000-0003-2388-0744 assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 creator 0000-0003-2388-0744 assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c creator 0000-0003-2388-0744 assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 creator 0000-0003-2388-0744 assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea creator 0000-0003-2388-0744 assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df dateModified "2024-03-05 12:23:14.184651+00:00" assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c dateModified "2023-09-13 18:25:04.460641+00:00" assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 dateModified "2023-09-13 18:24:55.832618+00:00" assertion.
- 11f3e069-8d1d-48da-876f-52fd6d255223 dateModified "2023-09-13 18:25:04.032738+00:00" assertion.
- 1dd2de01-f16f-4704-853a-116e2de3ff65 dateModified "2023-09-13 18:24:55.014153+00:00" assertion.
- 21e37e3d-1372-4acd-9ce7-b1a005e9a41a dateModified "2023-09-13 18:24:56.003168+00:00" assertion.
- 2dd40f39-d265-4afe-ab50-5238e7bd6b16 dateModified "2023-09-13 18:24:57.288292+00:00" assertion.
- 52497c90-bae5-4fbd-aca5-b5175ff7a4fa dateModified "2023-09-13 18:25:00.419280+00:00" assertion.
- 5e018f3a-bc34-48db-b72e-f85214a13a9d dateModified "2023-09-13 18:24:56.445276+00:00" assertion.
- 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 dateModified "2023-09-13 18:24:56.846580+00:00" assertion.
- 64cb8be0-e907-41a0-8f6b-dbf642fe4372 dateModified "2023-09-13 18:24:54.812023+00:00" assertion.
- 69f9d60c-925c-4b44-9b87-fc35c96eed2f dateModified "2023-09-13 18:24:54.625563+00:00" assertion.
- 6aadc530-3b55-44e0-8c8d-8762952df493 dateModified "2023-09-13 18:24:56.211113+00:00" assertion.
- 70cec592-1fbe-4b53-a8ef-8dd3522246d6 dateModified "2023-09-13 18:24:55.550150+00:00" assertion.
- 7d430516-d8cb-4bc0-a4d9-e103a63e3478 dateModified "2023-09-13 18:25:00.644693+00:00" assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c dateModified "2023-09-13 18:24:53.979999+00:00" assertion.
- b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 dateModified "2023-09-13 18:25:00.849293+00:00" assertion.
- d6d97bc9-1631-4347-920b-dec30f6aebea dateModified "2023-09-13 18:24:54.412314+00:00" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df datePublished "2023-09-13 18:24:50.860910+00:00" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df encodingFormat "application/ld+json" assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c encodingFormat "text/html" assertion.
- 89e1f03d-e2e9-46f1-8386-8cdeff35386c encodingFormat "text/markdown" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df hasPart 17b98cad-68c7-4991-acb1-b922f0c3d44f assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df hasPart 28d4cc6a-cc26-4629-865c-951dcec04a63 assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df hasPart 72f47650-7679-4648-bb43-d28fa77665c4 assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df hasPart ef0d1b26-8b1c-4a45-9758-52610cb8e3a8 assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df hasPart 043ddb69-1a1c-4eb4-8785-9e3803fa377c assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df hasPart 11f3e069-8d1d-48da-876f-52fd6d255223 assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df hasPart 89e1f03d-e2e9-46f1-8386-8cdeff35386c assertion.
- 17b98cad-68c7-4991-acb1-b922f0c3d44f hasPart 2dd40f39-d265-4afe-ab50-5238e7bd6b16 assertion.
- 28d4cc6a-cc26-4629-865c-951dcec04a63 hasPart 52497c90-bae5-4fbd-aca5-b5175ff7a4fa assertion.
- 28d4cc6a-cc26-4629-865c-951dcec04a63 hasPart 7d430516-d8cb-4bc0-a4d9-e103a63e3478 assertion.
- 28d4cc6a-cc26-4629-865c-951dcec04a63 hasPart b2864dbb-5ce6-4ad9-8dcb-27a0325baeb1 assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 0cdb95da-31b8-40ad-a2f9-32154e335db2 assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 1dd2de01-f16f-4704-853a-116e2de3ff65 assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 21e37e3d-1372-4acd-9ce7-b1a005e9a41a assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 5e018f3a-bc34-48db-b72e-f85214a13a9d assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 6420e76a-1e0e-42a0-9b9e-e355de2a88a5 assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 64cb8be0-e907-41a0-8f6b-dbf642fe4372 assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 69f9d60c-925c-4b44-9b87-fc35c96eed2f assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 6aadc530-3b55-44e0-8c8d-8762952df493 assertion.
- 72f47650-7679-4648-bb43-d28fa77665c4 hasPart 70cec592-1fbe-4b53-a8ef-8dd3522246d6 assertion.
- ef0d1b26-8b1c-4a45-9758-52610cb8e3a8 hasPart d6d97bc9-1631-4347-920b-dec30f6aebea assertion.
- bts480 identifier "https://doi.org/10.1093/bioinformatics/bts480" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df identifier "https://w3id.org/ro-id/7019724e-b5a0-4f7e-a7d6-a1baacac85df" assertion.
- 0000-0003-4771-6113 identifier "https://orcid.org/0000-0003-4771-6113" assertion.
- 7019724e-b5a0-4f7e-a7d6-a1baacac85df license no-permission assertion.
- 043ddb69-1a1c-4eb4-8785-9e3803fa377c license no-permission assertion.
- 0cdb95da-31b8-40ad-a2f9-32154e335db2 license no-permission assertion.
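
The hasPart matches listed above describe a containment hierarchy: the research object 7019724e-b5a0-4f7e-a7d6-a1baacac85df contains folders, which in turn contain resources. As a rough illustration only, the sketch below rebuilds that hierarchy as an indented tree. It assumes the matches have already been parsed into (subject, object) pairs; the names `HAS_PART`, `ROOT`, `build_children`, and `render_tree` are hypothetical and not part of any nanopublication or ROHub API, and only four of the listed matches are copied in to keep the example short.

```python
from collections import defaultdict

# Root research object, taken from the listing above.
ROOT = "7019724e-b5a0-4f7e-a7d6-a1baacac85df"

# A small subset of the hasPart matches, as (subject, object) pairs.
# Assumption: the listing has already been parsed into this form.
HAS_PART = [
    (ROOT, "17b98cad-68c7-4991-acb1-b922f0c3d44f"),
    (ROOT, "ef0d1b26-8b1c-4a45-9758-52610cb8e3a8"),
    ("17b98cad-68c7-4991-acb1-b922f0c3d44f", "2dd40f39-d265-4afe-ab50-5238e7bd6b16"),
    ("ef0d1b26-8b1c-4a45-9758-52610cb8e3a8", "d6d97bc9-1631-4347-920b-dec30f6aebea"),
]


def build_children(pairs):
    """Map each subject to the list of its direct parts, in input order."""
    children = defaultdict(list)
    for parent, child in pairs:
        children[parent].append(child)
    return dict(children)


def render_tree(node, children, depth=0):
    """Return the subtree under `node` as lines indented two spaces per level."""
    lines = ["  " * depth + node]
    for child in children.get(node, []):
        lines.extend(render_tree(child, children, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(ROOT, build_children(HAS_PART))))
```

The recursion assumes the hasPart relation is acyclic, which holds for the matches shown here; a cyclic input would need a visited set.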