
Sincerely seeking help from numerical analysis or data mining experts

Posted: 2012-02-25 10:01:48 Author: rapoo

Hi everyone: my boss recently gave me a task that, to me, is an IM (impossible mission) :). If you are interested, here is the task:
A       B       C       D       E       Ratio
0.075   8.641   1.906   6.649   82.729  180
0.097   9.090   2.026   6.852   81.936  192
0.086   8.736   1.927   6.618   82.633  184
0.034   14.550  1.243   4.978   79.195  283
0.047   14.635  1.244   4.586   79.487  282
0.034   14.701  1.249   5.034   78.982  286
0.038   14.761  1.215   4.767   79.220  287
6.775   62.070  7.766   8.677   14.712  7789
6.027   59.546  7.438   8.519   18.470  5298
6.477   61.458  7.688   8.810   15.567  6748
6.617   62.048  7.779   8.890   14.666  7509
16.017  45.219  4.805   11.903  22.056  3827
1.518   2.440   0.859   10.293  84.889  67
1.272   2.080   0.727   8.889   87.031  53
0.839   1.334   0.477   7.092   90.257  34
16.012  45.204  4.803   11.899  22.082  3827
0.021   1.415   0.925   7.992   89.646  45
0.115   10.969  1.883   10.736  76.297  207
0.106   13.379  2.642   10.770  73.104  308
2.864   79.304  0.835   1.061   15.936  6839
0.211   35.754  6.357   11.471  46.207  1209
0.233   9.904   1.599   4.120   84.145  187
0.219   9.293   1.496   3.586   85.407  171
0.218   9.553   1.495   3.467   85.266  179
0.229   10.138  1.633   4.200   83.801  191
0.160   97.679  0.362   0.109   1.689   71653
0.147   97.234  0.370   0.955   1.294   96858
0.352   73.037  6.579   7.027   13.004  10317
0.283   73.869  5.215   5.968   14.665  8051
0.029   0.036   0.010   0.715   99.210  1
0.017   0.000   0.005   2.129   97.849  1
0.026   0.052   0.012   0.526   99.384  2
2.664   46.045  10.286  20.890  20.115  5238
6.571   41.757  9.683   23.393  18.596  5240
7.302   47.444  8.011   16.450  20.793  5305
13.841  56.975  2.356   7.183   19.645  5357
2.794   55.096  8.782   13.137  20.191  5380
0.948   51.797  10.097  16.314  20.843  5393
10.565  55.640  4.475   9.214   20.106  5421
7.979   48.396  8.216   15.249  20.160  5443
21.450  43.907  9.097   9.497   16.049  5449
11.567  55.398  2.735   9.641   20.659  5536
0.821   51.784  10.144  16.403  20.848  5553
0.271   82.789  7.368   6.657   2.914   80925
0.746   83.802  6.796   5.186   3.470   83683
0.159   70.716  10.495  14.068  4.563   85783
2.052   49.864  17.241  22.859  7.984   87673
0.150   68.182  10.192  14.107  7.368   96611
0.567   0.150   0.054   0.296   98.933  6
0.170   5.421   0.825   7.301   86.283  114
0.006   0.271   0.297   4.216   95.211  11
0.080   9.402   2.000   9.642   78.876  210
6.556   23.721  3.420   7.778   58.525  681
0.000   0.860   0.230   0.840   98.070  13
0.058   4.981   0.515   2.049   92.396  86
0.062   0.482   0.176   1.754   97.526  15
0.021   1.027   0.958   5.588   92.405  35
7.131   11.721  2.401   7.189   71.558  309
0.130   18.661  4.803   10.766  65.639  457
0.034   0.974   0.164   4.206   94.623  20
0.075   59.639  5.061   10.112  25.113  3453
0.313   5.447   2.056   8.306   83.877  115
0.547   3.313   2.784   11.033  82.324  115
0.063   4.214   1.799   10.376  83.548  116
1.531   5.673   1.469   4.105   87.223  117
1.377   4.897   1.390   7.642   84.694  118
0.171   5.614   1.823   7.340   85.052  120
1.338   4.379   1.625   7.929   84.730  120
1.089   4.896   2.203   9.564   82.247  122
0.560   4.229   2.829   12.814  79.567  125
0.103   5.722   2.147   8.611   83.416  125
1.252   5.203   1.719   5.211   86.615  126
0.243   4.715   2.941   10.748  81.353  127
0.655   5.983   1.715   4.887   86.759  127
0.502   6.817   1.050   2.242   89.389  128
0.143   6.139   1.987   9.002   82.729  131
1.263   5.845   1.945   8.776   82.171  132
0.844   4.676   2.670   13.023  78.787  134
1.569   5.726   1.565   7.924   83.216  136
1.430   7.157   1.537   5.109   84.767  140
0.171   7.646   1.958   4.236   85.989  141
1.599   6.290   1.890   5.057   85.165  143
1.141   6.647   1.731   8.008   82.473  143
0.346   8.049   1.458   3.541   86.606  145
0.429   8.349   1.227   3.975   86.020  149
0.817   5.966   3.153   11.234  78.830  152
0.952   5.529   2.829   13.212  77.478  155
1.987   6.359   1.825   7.139   82.690  165
1.560   7.105   2.180   5.383   83.772  171
1.660   7.857   2.124   10.303  78.057  176
1.255   8.351   2.245   5.513   82.637  179
0.085   7.687   2.442   6.879   82.907  182
2.315   8.086   1.629   7.790   80.180  185
2.181   7.319   1.958   7.385   81.157  189
1.116   7.645   2.190   7.986   81.063  192
0.731   7.731   3.554   8.927   79.057  197
1.058   8.261   3.075   7.776   79.830  199
0.226   6.898   4.381   12.628  75.867  199
2.111   8.737   1.954   8.556   78.642  206
0.925   7.188   4.020   11.902  75.965  207
0.104   9.095   2.804   13.178  74.819  212
2.064   8.830   2.111   8.496   78.499  216
1.025   6.310   2.590   18.511  71.565  230
0.251   9.455   3.191   13.070  74.034  233
0.396   9.521   4.093   11.907  74.083  243
0.143   7.660   4.199   15.100  72.898  243
0.365   8.878   4.155   12.338  74.264  243
1.195   7.325   4.570   14.931  71.979  250
0.488   11.767  2.458   9.344   75.942  264
0.638   9.048   3.803   8.895   77.615  270
1.780   11.389  3.025   8.548   75.258  273
1.392   11.702  3.759   8.598   74.549  290
0.228   11.649  3.707   11.777  72.640  309
1.086   9.812   7.061   15.090  66.951  314
0.465   5.355   3.838   15.888  74.454  321
2.460   12.471  3.234   6.904   74.931  352
1.765   12.930  3.030   12.103  70.172  394
3.995   15.734  5.778   15.529  58.964  759
2.077   22.969  10.597  18.763  45.594  982
1.907   28.974  8.072   21.651  39.397  1516
3.975   35.371  3.703   17.836  39.114  1712
1.751   45.493  6.252   10.463  36.041  2237
1.073   37.895  12.885  18.857  29.290  2684
3.653   39.434  13.143  16.450  27.320  2819
0.661   44.035  10.548  15.067  29.689  2887
5.922   43.261  6.761   13.543  30.512  2900
7.121   44.918  7.716   15.595  24.649  3480
0.706   46.495  11.005  15.520  26.274  3621
4.286   39.261  10.858  22.881  22.714  4011
20.681  42.302  8.820   9.218   18.978  4194
0.733   48.670  11.464  16.061  23.072  4660
21.124  42.973  8.991   9.339   17.573  4733
4.175   44.607  14.675  16.445  20.098  4737
0.634   50.851  10.246  17.658  20.611  5817
0.605   52.022  10.187  17.112  20.075  6009
0.753   50.345  11.809  16.539  20.554  6073
7.532   49.052  8.261   16.680  18.475  6726
4.246   58.751  8.092   10.897  18.014  6776
2.086   47.344  14.562  19.361  16.647  7277
9.708   61.738  3.522   8.826   16.206  7421
2.057   62.258  7.602   14.323  13.760  9028
7.831   56.564  9.226   11.040  15.337  10402
10.596  57.297  7.627   11.650  12.830  11377
11.529  53.157  9.971   13.653  11.690  12061
13.019  63.686  3.018   8.001   12.276  12331
1.925   47.971  16.008  21.698  12.398  14964
0.690   60.208  6.396   20.546  12.161  18153
2.987   61.770  12.105  14.542  8.596   20237
5.534   71.938  7.090   7.132   8.307   22529
0.740   61.218  6.521   20.680  10.840  24038
3.572   64.785  9.325   13.176  9.142   28533
0.583   72.983  9.100   9.805   7.529   28868
3.762   57.142  11.219  19.540  8.336   29069
0.624   80.388  6.597   5.745   6.646   29484
2.320   53.131  16.135  20.130  8.284   30613
4.627   74.173  7.226   6.700   7.274   38765
3.433   53.118  14.650  18.916  9.882   38941
1.138   76.119  7.742   8.896   6.105   39845
3.616   64.908  9.762   13.880  7.833   40383
3.876   57.442  11.701  20.216  6.766   57401
0.000   62.903  20.505  12.355  4.237   60305
6.317   65.752  9.715   10.937  7.278   72520
3.792   44.950  9.023   14.502  27.733  2848
1.878   6.678   0.752   2.630   88.061  131
0.515   82.361  1.951   4.352   10.821  11837
0.160   9.846   2.151   7.895   79.948  211
0.351   64.531  3.515   7.943   23.659  3989
8.935   8.748   1.740   7.009   73.568  261
0.252   2.563   1.212   13.889  82.084  79
4.607   1.716   1.512   6.369   85.796  86
7.887   7.722   1.536   6.190   76.666  211
0.100   13.174  2.603   10.311  73.811  291

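This is an Excel sheet with several thousand rows in total; I picked some representative rows and pasted them here. As you can see, there are five fields: A, B, C, D, and E. As for what the Ratio field is for, I asked my boss, and his answer was: no idea... The requirement is this: suppose there is a field P. How do we obtain P[n] from each row's A[n], B[n], C[n], D[n], E[n]? In other words, find the F() in P = F(A,B,C,D,E) and then compute P. Frankly, I am completely lost. What seems even more outrageous to me is that if further fields such as F and G are added later, F() has to be extensible, i.e. P = F(A,B,C,D,E,F,G) must still work.
That is all I know about the requirements. I asked a few more questions, but my boss said he does not know either.
So my questions are: 1. Can this be done? 2. If the n-dimensional case cannot be done, can the current 5-dimensional case, P = F(A,B,C,D,E), be done (I think this step is certainly feasible)? If it can, could you suggest some approaches or related books for guidance? 3. If it cannot be done, I would like to know why not.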


Many thanks for your help. My contact details: mail: lixin19821010@163.com, MSN: lixin19821010@hotmail.com, QQ: 147370451


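[Solution]
1. Yes.
2. Yes. You already know which books: the ones on data mining / data warehousing.
3. If it cannot be done, that is because the data is not representative, or because the data itself has no regularity and is just arbitrary numbers.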


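A hint: F() can simply be a neural network. For a relatively simple problem like this one, using a genetic algorithm directly would also work.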
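As a rough illustration of the genetic-algorithm route, here is a minimal Python sketch. Everything specific in it is an assumption rather than part of the task: the unknown P is replaced by log(1 + Ratio) as a stand-in target, F() is taken to be a plain linear model (five weights plus a bias), and the population size, mutation scale, and the name `evolve` are made up for the example.

```python
import math
import random

# Rows copied from the table: (A, B, C, D, E, Ratio).
ROWS = [
    (0.075, 8.641, 1.906, 6.649, 82.729, 180),
    (6.775, 62.070, 7.766, 8.677, 14.712, 7789),
    (1.518, 2.440, 0.859, 10.293, 84.889, 67),
    (0.160, 97.679, 0.362, 0.109, 1.689, 71653),
    (0.029, 0.036, 0.010, 0.715, 99.210, 1),
    (2.664, 46.045, 10.286, 20.890, 20.115, 5238),
]

# Inputs scaled to [0, 1]. The target is an ASSUMED stand-in for P.
X = [[v / 100.0 for v in row[:5]] for row in ROWS]
Y = [math.log1p(row[5]) / 12.0 for row in ROWS]


def predict(genes, x):
    """Linear model: genes[0..4] are weights, genes[5] is the bias."""
    return sum(g * xi for g, xi in zip(genes, x)) + genes[-1]


def error(genes):
    """Mean squared error of one candidate over all rows (lower is fitter)."""
    return sum((predict(genes, x) - y) ** 2 for x, y in zip(X, Y)) / len(X)


def evolve(generations=200, pop_size=30, seed=0):
    """Evolve candidates; return (best genes, best error per generation)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=error)
        history.append(error(pop[0]))
        parents = pop[: pop_size // 2]        # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, 6)         # one-point crossover
            child = mum[:cut] + dad[cut:]
            child[rng.randrange(6)] += rng.gauss(0.0, 0.1)  # mutate one gene
            children.append(child)
        pop = parents + children
    pop.sort(key=error)
    history.append(error(pop[0]))
    return pop[0], history


best, history = evolve()
print("first/last best error:", history[0], history[-1])
```

Because the fitter half of each generation survives unchanged, the best error can never get worse from one generation to the next; whether a linear F() is expressive enough for the real P is a separate question.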

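First get a rough understanding of neural networks / genetic algorithms, or simply write one by copying an existing example. Generate a random set of initial parameters, then feed the data from the Excel sheet in as input and let the network learn from it repeatedly. (You must know the correct P value for every row; otherwise there is nothing to feed back.)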
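The loop described above (random initial parameters, then repeated learning from rows whose correct output is known) might look roughly like the following sketch. It assumes Python, a one-hidden-layer network with four tanh units, and, purely for illustration, log(1 + Ratio) as a stand-in for the unknown P; the function names and the learning rate are invented for the example.

```python
import math
import random

# Rows copied from the table: (A, B, C, D, E, Ratio).
ROWS = [
    (0.075, 8.641, 1.906, 6.649, 82.729, 180),
    (6.775, 62.070, 7.766, 8.677, 14.712, 7789),
    (1.518, 2.440, 0.859, 10.293, 84.889, 67),
    (0.160, 97.679, 0.362, 0.109, 1.689, 71653),
    (0.029, 0.036, 0.010, 0.715, 99.210, 1),
    (2.664, 46.045, 10.286, 20.890, 20.115, 5238),
]

# Inputs scaled to [0, 1]. The target is an ASSUMED stand-in for P.
X = [[v / 100.0 for v in row[:5]] for row in ROWS]
Y = [math.log1p(row[5]) / 12.0 for row in ROWS]


def init_net(n_in, n_hidden, rng):
    """Random initial parameters for a one-hidden-layer network."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return (hidden activations, scalar output) for one input row."""
    h = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(net["w1"], net["b1"])]
    return h, sum(w * hi for w, hi in zip(net["w2"], h)) + net["b2"]


def train_epoch(net, xs, ys, lr=0.05):
    """One pass of stochastic gradient descent on the squared error."""
    for x, y in zip(xs, ys):
        h, out = forward(net, x)
        err = out - y                      # the feedback needs the known target
        for j, hj in enumerate(h):
            grad_pre = err * net["w2"][j] * (1.0 - hj * hj)
            net["w2"][j] -= lr * err * hj
            net["b1"][j] -= lr * grad_pre
            for i, xi in enumerate(x):
                net["w1"][j][i] -= lr * grad_pre * xi
        net["b2"] -= lr * err


def mse(net, xs, ys):
    """Mean squared error of the network over a set of rows."""
    return sum((forward(net, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)
net = init_net(5, 4, rng)
loss_before = mse(net, X, Y)
for _ in range(2000):
    train_epoch(net, X, Y)
loss_after = mse(net, X, Y)
print("loss before/after training:", loss_before, loss_after)
```

The `err = out - y` line is where the known P value enters: without a correct answer per row, there is no error signal to propagate back.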

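After repeated learning on typical data (taking care not to let the network overtrain), you will have a neural network fitted to this data set. Any new data fed in later will immediately yield the corresponding P value. The more typical the original data set, the more accurate the P values it gives.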
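One common way to avoid overtraining is to hold some rows back and stop once the error on them no longer improves. The sketch below shows that idea only. To keep it short it swaps the network for a linear model, uses log(1 + Ratio) as an assumed stand-in for P, and splits the rows arbitrarily; `PATIENCE` and the other names are hypothetical.

```python
import math

# Rows copied from the table: (A, B, C, D, E, Ratio). The split is arbitrary.
TRAIN = [
    (0.075, 8.641, 1.906, 6.649, 82.729, 180),
    (6.775, 62.070, 7.766, 8.677, 14.712, 7789),
    (1.518, 2.440, 0.859, 10.293, 84.889, 67),
    (0.160, 97.679, 0.362, 0.109, 1.689, 71653),
    (0.029, 0.036, 0.010, 0.715, 99.210, 1),
    (2.664, 46.045, 10.286, 20.890, 20.115, 5238),
]
HELD_OUT = [
    (0.097, 9.090, 2.026, 6.852, 81.936, 192),
    (6.027, 59.546, 7.438, 8.519, 18.470, 5298),
    (0.147, 97.234, 0.370, 0.955, 1.294, 96858),
]


def to_xy(rows):
    """Scale inputs to [0, 1]; the target is an ASSUMED stand-in for P."""
    xs = [[v / 100.0 for v in r[:5]] for r in rows]
    ys = [math.log1p(r[5]) / 12.0 for r in rows]
    return xs, ys


def predict(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mse(w, b, xs, ys):
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sgd_epoch(w, b, xs, ys, lr=0.1):
    """One SGD pass; updates w in place and returns the new bias."""
    for x, y in zip(xs, ys):
        err = predict(w, b, x) - y
        for i, xi in enumerate(x):
            w[i] -= lr * err * xi
        b -= lr * err
    return b


train_x, train_y = to_xy(TRAIN)
val_x, val_y = to_xy(HELD_OUT)

w, b = [0.0] * 5, 0.0
initial_val_loss = mse(w, b, val_x, val_y)
best_loss, best_params, bad_epochs = float("inf"), None, 0
PATIENCE = 50  # epochs to wait for an improvement before stopping

for epoch in range(5000):
    b = sgd_epoch(w, b, train_x, train_y)
    loss = mse(w, b, val_x, val_y)      # measured on rows NOT used for training
    if loss < best_loss:
        best_loss, best_params, bad_epochs = loss, (list(w), b), 0
    else:
        bad_epochs += 1
        if bad_epochs >= PATIENCE:
            break

w, b = best_params                      # roll back to the best snapshot
print("held-out loss: initial", initial_val_loss, "best", best_loss)
```

The same loop applies unchanged to a real network: only `sgd_epoch` and `predict` would be replaced by the network's training step and forward pass.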

If F() later needs additional input parameters, just extend the neural network and retrain it (what it has already learned can be kept).
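One possible way to keep what was learned when new input fields arrive is to widen the first layer and start the new weights at zero, so the extended network initially gives exactly the old answers and retraining only has to learn the contribution of the new fields. The sketch below assumes a one-hidden-layer tanh network; the weight values are made-up placeholders standing in for trained parameters, and `widen_inputs` is a hypothetical helper.

```python
import math


def forward(w1, b1, w2, b2, x):
    """Scalar output of a one-hidden-layer tanh network."""
    h = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(w1, b1)]
    return sum(w * hi for w, hi in zip(w2, h)) + b2


def widen_inputs(w1, n_new):
    """Append n_new zero weights per hidden unit for the added input fields."""
    return [row + [0.0] * n_new for row in w1]


# Placeholder values standing in for parameters learned on A..E (not real results).
w1 = [[0.1, -0.4, 0.3, 0.2, -0.1],
      [0.5, 0.2, -0.3, 0.1, 0.4]]
b1 = [0.05, -0.02]
w2 = [0.7, -0.6]
b2 = 0.1

# First row of the table, scaled by 1/100.
x5 = [0.00075, 0.08641, 0.01906, 0.06649, 0.82729]
# The same row plus two invented values for the hypothetical new fields.
x7 = x5 + [0.3, 0.7]

before = forward(w1, b1, w2, b2, x5)
w1_wide = widen_inputs(w1, 2)
after = forward(w1_wide, b1, w2, b2, x7)

print("output unchanged after widening:", abs(before - after) < 1e-12)
```

After widening, training continues as before on rows that include the new fields; the zero-initialized weights then move away from zero only as far as the new fields actually help.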

