[POJ] DNA Sorting

简介:

Time Limit: 1000MS
Memory Limit: 10000K
Total Submissions: 83069
Accepted: 33428

Description
One measure of unsortedness in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence DAABEC, this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence AACEDGG has only one inversion (E and D)—it is nearly sorted—while the sequence ZWQM has 6 inversions (it is as unsorted as can be—exactly the reverse of sorted).

You are responsible for cataloguing a sequence of DNA strings (sequences containing only the four letters A, C, G, and T). However, you want to catalog them, not in alphabetical order, but rather in order of sortedness, from most sorted to least sorted. All the strings are of the same length.

Input
The first line contains two integers: a positive integer n (0 < n <= 50) giving the length of the strings; and a positive integer m (0 < m <= 100) giving the number of strings. These are followed by m lines, each containing a string of length n.

Output
Output the list of input strings, arranged from most sorted'' toleast sorted”. Since two strings can be equally sorted, then output them according to the orginal order.

Sample Input

10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT

Sample Output

CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA

Source
East Central North America 1998

解题思路

首先求出逆序数,然后对逆序数进行排队,最后输出排序后的DNA序列。对于求逆序数,只需对序列进行遍历,找出本应出现在某字符之后,却出现在了它之前的字符数即可(顺序为ACGT)。

实现代码

#include <iostream>
#include <algorithm>
#include <cstring>
using namespace std;

typedef struct node
{
    char s[100];
    int value; //逆序对数
}DNA;

bool cmp(DNA d1, DNA d2)
{
    return d1.value < d2.value;
}

//获取逆序对数
int rev_cnt(char *ch)
{
    int cnt[4] = {0};
    int reverse_cnt = 0;
    int i = 0;
    while(ch[i])
    {
        switch(ch[i])
        {
        case 'A':
            cnt[0]++;
            reverse_cnt += cnt[1] + cnt[2] + cnt[3];
            break;
        case 'C':
            cnt[1]++;
            reverse_cnt += cnt[2] + cnt[3];
            break;
        case 'G':
            cnt[2]++;
            reverse_cnt += cnt[3];
            break;
        case 'T':
            cnt[3]++;
            break;
        }
        i++;
    }

    return reverse_cnt;
}

void sort(DNA *d, int len)
{
    for (int i = 0; i < len - 1; i++)
    {
        int index = i;
        for (int j = i + 1; j < len; j++)
        {
            if (d[j].value < d[index].value)
            {
                index = j;
            }
        }

        if (index != i)
        {
            //DNA temp;
            //memcpy(&temp, &d[i], sizeof(d[i]));
            //memcpy(&d[i], &d[index], sizeof(d[index]));
            //memcpy(&d[index], &temp, sizeof(temp));
            DNA temp = d[i];
            d[i] = d[index];
            d[index] = temp;
        }
    }
}

int main()
{
    int m, n;
    scanf("%d%d", &m, &n); //m为输入字符长度,n为数据组数
    DNA *d = new DNA[n];
    int i = 0;
    while(i < n)
    {
        scanf("%s", d[i].s);
        d[i].value = rev_cnt(d[i].s);
        i++;
    }

    sort(d, d + n, cmp);
    //sort(d, n);自定义排序函数
    for (int i = 0; i < n; i++)
    {
        cout<<d[i].s<<endl;
    }

    delete [] d;
    return 0;
}

排序部分可以用<algorithm>里面的sort,也可以是自定义的。输入部分如果将scanf改为cin.getline(d[i].s, m+1)则需要先用一个scanf("%c", &c)读取两个整数之后的回车符。


转载:http://blog.csdn.net/foreverling/article/details/44559603

目录
相关文章
|
存储
poj 3254 Corn Fields (状态压缩dp)
状态压缩dp其实就是用二进制来表示所有的状态,比如这题, 我们在某一行可以这样取0 1 0 1 1 0 1,用1代表取了,0代表没取,因为这点,它的数据量也限制在20以内,所有看到这样数据量的题目可以先考虑一下状态压缩dp。对于有多行的数据,所有状态的总数必然很庞大,而且不用特殊的方法想要存储这些状态是不太现实的。既然每个点只有这两种情况,我们可以用二进制的一位来表示,0 1 0 1 1 0 1就可以表示为二进制0101101也就是十进制的45,如果我们想要枚举6个点的所有状态,我们只需要从0到2^6取其二进制就可以了,并不会重复或是多余。
38 0
|
算法
The kth great number(小根堆思想,模板题)
The kth great number(小根堆思想,模板题)
55 0
|
开发框架 .NET
poj 3468 A Simple Problem with Integers线段树区间修改
题目意思很简单,有N个数,Q个操作, Q l r 表示查询从l到r 的和,C l r v 表示将从l到r 的值加上v,明显的线段树,不知道线段树的人肯定暴力,肯定超时,哈哈!!
36 0
华为机试HJ63:DNA序列
华为机试HJ63:DNA序列
|
人工智能
POJ 2299 Ultra-QuickSort(树状数组+离散化+求逆序数)
POJ 2299 Ultra-QuickSort(树状数组+离散化+求逆序数)
|
算法 测试技术
lecture 2.1 简单算法
(一)while循环 1. Convert the following into code that uses a while loop.
1132 0
|
人工智能 BI 机器学习/深度学习