Pentaho Data Integration prepares and blends data to create a complete picture of your business that drives actionable insights. The complete data integration platform delivers accurate, "analytics ready" data to end users from any source. With visual tools that eliminate coding and reduce complexity, Pentaho puts big data and all other data sources at the fingertips of business and IT users alike.

Simple Visual Designer for Drag and Drop Development
Empower developers with visual tools to minimize coding and achieve greater productivity.

Drag and Drop Visual Design Approach
- Graphical extract-transform-load (ETL) tool to load and process big data sources in familiar ways.
- Rich library of pre-built components to access and transform data from a full spectrum of sources.
- Visual interface to call custom code, for example to analyze image and video files and create meaningful metadata.
- Dynamic transformations, using variables to determine field mappings, validation and enrichment rules.
- Integrated debugger for testing and tuning job execution.
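
The idea of a dynamic transformation, where variables decide the field mappings and validation rules at run time, can be illustrated with a short Python sketch. This is not Pentaho's API: the function `build_transform`, the variable keys `field_mapping` and `required_fields`, and the sample field names are all hypothetical, chosen only to show the pattern.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def build_transform(variables: Dict[str, Any]) -> Callable[[Row], Row]:
    """Return a row transform configured entirely by runtime variables.

    Hypothetical keys (assumptions, not Pentaho names):
      field_mapping   -- maps source field names to target field names
      required_fields -- target fields that must be non-empty after mapping
    """
    mapping: Dict[str, str] = variables["field_mapping"]
    required: List[str] = variables.get("required_fields", [])

    def transform(row: Row) -> Row:
        # Keep only mapped fields, renamed to their target names.
        out = {target: row.get(source) for source, target in mapping.items()}
        # Validate using rules that also come from the variables.
        missing = [f for f in required if out.get(f) in (None, "")]
        if missing:
            raise ValueError(f"missing required fields: {missing}")
        return out

    return transform


# The same code serves a different source just by changing the variables.
crm_vars = {
    "field_mapping": {"cust_nm": "name", "cust_email": "email"},
    "required_fields": ["email"],
}
transform = build_transform(crm_vars)
result = transform({"cust_nm": "Ada", "cust_email": "ada@example.com", "extra": 1})
print(result)  # {'name': 'Ada', 'email': 'ada@example.com'}
```

Because the mapping lives in data rather than in code, pointing the same transform at a new source only requires a new set of variables.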
Big Data Integration with Zero-Coding Required
Pentaho's intuitive tools speed up the design, development and deployment of big data analytics by as much as 15x.

Big Data Integration Made Easy
- Complete visual development tools eliminate coding in SQL or writing MapReduce Java functions.
- Broad connectivity to any type or source of data with native support for Hadoop, NoSQL and analytic databases.
- Parallel processing engine to ensure high performance and enterprise scalability.
- Extract and blend existing and diverse data to produce consistent, high-quality, ready-to-analyze data.
Native and Flexible Support for all Big Data Sources
A combination of deep native connections and an adaptive big data layer ensures accelerated access to the leading Hadoop distributions, NoSQL databases, and other big data stores.

Broadest and Deepest Big Data Support
- Support for the latest Hadoop distributions from Cloudera, Hortonworks, MapR and Intel.
- Simple plugins to NoSQL databases such as Cassandra and MongoDB, as well as connections to specialized data stores like Amazon Redshift and Splunk.
- Adaptive big data layer saves enterprises considerable development time as they leverage new versions and capabilities.
- Greater flexibility, reduced risk, and insulation from changes in the big data ecosystem.
- Reporting and analysis on growing amounts of user and machine generated data, including web content, documents, social media and log files.
- Integration of Hadoop data tasks into overall IT/ETL/BI solutions with scalable distribution across the cluster.
- Support for parallel bulk data loader utilities for loading data with maximum performance.
Powerful Administration and Management
Simplified out-of-the-box capabilities to manage the operations in a data integration project.

Easy-to-Use Schedule Management
- Manage security privileges for users and roles.
- Restart jobs from last successful checkpoint and roll back job execution on failure.
- Integrate with existing security definitions in LDAP and Active Directory.
- Set permissions to control user actions: read, execute or create.
- Schedule data integration flows for organized process management.
- Monitor and analyze the performance of data integration processes.
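
Restarting a job from its last successful checkpoint can be sketched in a few lines of Python. This is a generic illustration of the technique, not how Pentaho implements it: `run_job`, the JSON checkpoint file, and the step names are hypothetical, and the sketch covers only the restart half (it does not roll back work done by a failed step).

```python
import json
import os
import tempfile
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def run_job(steps: List[Step], checkpoint_path: str) -> List[str]:
    """Run steps in order, skipping any already recorded in the checkpoint file.

    Returns the names of the steps executed during this call.
    """
    done: List[str] = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            done = json.load(fh)

    executed: List[str] = []
    for name, action in steps:
        if name in done:
            continue  # completed in an earlier run
        action()  # an exception here leaves the checkpoint at the last success
        done.append(name)
        executed.append(name)
        with open(checkpoint_path, "w") as fh:
            json.dump(done, fh)  # persist progress after every successful step
    return executed


# Simulate a step that fails on its first attempt only.
attempts = {"load": 0}


def flaky_load() -> None:
    attempts["load"] += 1
    if attempts["load"] == 1:
        raise RuntimeError("simulated load failure")


steps = [("extract", lambda: None), ("load", flaky_load), ("report", lambda: None)]
checkpoint = os.path.join(tempfile.mkdtemp(), "job.checkpoint.json")

try:
    run_job(steps, checkpoint)
except RuntimeError:
    pass  # first run stops at "load"; "extract" is already checkpointed

resumed = run_job(steps, checkpoint)
print(resumed)  # ['load', 'report']
```

The second run skips `extract` because the checkpoint file already lists it, so only the failed step and the steps after it are executed again.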
Data Profiling and Data Quality
Profile data and ensure data quality with comprehensive capabilities for data managers.

Data Quality Management
- Identify data that fails to comply with business rules and standards.
- Standardize, validate, de-duplicate and cleanse inconsistent or redundant data.
- Manage data quality with partners such as Human Inference and Melissa Data.
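
As a rough illustration of the standardize, validate and de-duplicate steps listed above, here is a minimal Python sketch. It is not Pentaho functionality: the record layout (`name`, `email`), the deliberately simple email pattern, and the choice of email as the duplicate key are all assumptions made for the example.

```python
import re
from typing import Dict, List, Tuple

Record = Dict[str, str]

# Deliberately simple pattern (an assumption); real business rules would be stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def standardize(rec: Record) -> Record:
    """Normalize whitespace and casing so equivalent values compare equal."""
    return {
        "name": " ".join(rec.get("name", "").split()).title(),
        "email": rec.get("email", "").strip().lower(),
    }


def clean(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Return (valid, rejected): valid rows are standardized and de-duplicated."""
    valid: List[Record] = []
    rejected: List[Record] = []
    seen = set()
    for rec in records:
        std = standardize(rec)
        if not EMAIL_RE.match(std["email"]):
            rejected.append(std)  # fails the validation rule
            continue
        if std["email"] in seen:
            continue  # redundant record, drop it
        seen.add(std["email"])
        valid.append(std)
    return valid, rejected


raw = [
    {"name": "  ada   lovelace", "email": "Ada@Example.com "},
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": "bob", "email": "not-an-email"},
]
valid, rejected = clean(raw)
print(valid)     # [{'name': 'Ada Lovelace', 'email': 'ada@example.com'}]
print(rejected)  # [{'name': 'Bob', 'email': 'not-an-email'}]
```

Standardizing before comparing is what lets the first two records be recognized as duplicates even though their raw values differ.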