Why the rush to find a replacement for Python?

Translator | Liu Tao

Artificial intelligence (AI) and machine learning (ML) are our everyday companions, and it is hard to imagine life without algorithms and statistical models. Whenever we hear the term machine learning, the first thing that comes to mind is Python. Python has long been the face of machine learning and has played an important role in implementing its technical side.

Python is probably the most polished language for machine learning, used by 48.24% of developers. Thanks to its powerful, convenient low-level packages and high-level APIs, Python has unparalleled advantages across scientific computing, but it also has disadvantages for specific tasks. As a result, the need for alternatives to Python is becoming more common.

Many emerging languages now match or even exceed Python in performance, so Python is no longer the only option for machine learning. Scala, Julia, MQL5, and other languages that are not supersets of Python can be used to develop and deliver machine learning applications.

This article explores the emergence of new machine learning languages, how they are eroding Python's market share, and what may change in machine learning in the future.

  1. Disadvantages of Python
  ==========================

Let’s discuss some of the shortcomings of Python that developers and data scientists often face.

Performance and speed

Python is fast to develop in but slow at computation. Its reference implementation, CPython, interprets bytecode instead of compiling to native machine code, executing statements one at a time with dynamic type checks on every operation. This makes pure-Python loops much slower than equivalent C code, which is compiled ahead of time, and slower than other data science programming languages. Unless you master vectorized code, you will notice how slow it is.
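The gap is easy to observe with the standard library alone. The sketch below times an explicit Python loop against the built-in sum(), which runs the same loop in C inside CPython; this is the same idea that vectorized code relies on. The function names loop_sum and builtin_sum are illustrative, and the timings depend on the machine, so none are shown.

```python
import timeit


def loop_sum(values):
    """Add the numbers one at a time in interpreted Python bytecode."""
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    """Delegate the same loop to sum(), which CPython implements in C."""
    return sum(values)


data = list(range(200_000))

# Both approaches compute the same result.
assert loop_sum(data) == builtin_sum(data)

# Time five runs of each; the absolute numbers vary from machine to machine.
loop_time = timeit.timeit(lambda: loop_sum(data), number=5)
builtin_time = timeit.timeit(lambda: builtin_sum(data), number=5)
print(f"explicit loop: {loop_time:.3f}s, built-in sum: {builtin_time:.3f}s")
```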

Cython is a superset of Python that compiles Python source code into C. Its main benefit is C-level computing speed without additional computing resources. Many programmers use Cython to write code that runs close to the speed of C while remaining more concise and readable.

Design Constraints

Many of Python's design problems stem from its dynamic typing. In dynamically typed languages, variables do not need a declared type. Python also relies on duck typing, which can be confusing: what an object can do matters more than its class or attributes. Duck typing does not require type checks; instead, you can verify that an object provides the methods or attributes you need.
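A minimal sketch of duck typing follows. The classes Duck and Robot and the function make_it_speak are hypothetical names chosen for illustration; the point is that the function checks for behaviour rather than type, and that a wrong argument is only detected at run time.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


def make_it_speak(thing):
    """Call speak() on any object that has it, whatever its class."""
    # Check for the behaviour rather than the type.
    if not callable(getattr(thing, "speak", None)):
        raise TypeError(f"{type(thing).__name__} cannot speak")
    return thing.speak()


print(make_it_speak(Duck()))   # Quack
print(make_it_speak(Robot()))  # Beep

# The downside of dynamic typing: this mistake is only caught at run time.
try:
    make_it_speak(42)
except TypeError as err:
    print(err)  # int cannot speak
```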

Inefficient memory consumption

Python uses a lot of memory because every value is a full object that carries its own overhead. Its flexible data types add to this consumption, so Python is not well suited to memory-intensive tasks. As a result, memory management becomes challenging when you build large, long-running Python systems.
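The overhead can be measured with the standard library. The sketch below, which assumes CPython, compares a list of float objects with an array("d") that packs the same values as raw 8-byte doubles. Exact byte counts differ between Python versions and platforms, so none are shown.

```python
import sys
from array import array

n = 1000
as_list = [float(i) for i in range(n)]  # list of separate float objects
as_array = array("d", as_list)          # packed 8-byte C doubles

# A list stores pointers; each float is a full object with its own header,
# so the total is the list itself plus every element it points to.
list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
array_bytes = sys.getsizeof(as_array)

print(f"list of floats: {list_bytes} bytes")
print(f"array('d'):     {array_bytes} bytes")
```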

Insufficient threads

Python threads are less efficient than threads in other languages. Python supports multithreading, but its Global Interpreter Lock (GIL) allows only one thread to execute Python code at a time, so a multithreaded Python program gets roughly the processing efficiency of a single thread. Threads can run truly in parallel only when the work happens inside native libraries that release the GIL. Jython, a Python implementation for the JVM, has no GIL and supports true multithreading, while standard Python (CPython) does not.
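The effect can be sketched with a CPU-bound function run first sequentially and then in a thread pool. The names count_down and timed are illustrative, and the workload size is an arbitrary assumption. Timings vary by machine and by Python build, so none are shown.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def count_down(n):
    """CPU-bound busy loop; returns the number of iterations performed."""
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def timed(fn):
    """Run fn() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


N = 500_000  # iterations per job (arbitrary size for the demo)
JOBS = 4

# Run the four jobs one after another in a single thread.
seq_result, seq_time = timed(lambda: [count_down(N) for _ in range(JOBS)])


# Run the same four jobs in four threads.
def run_threaded():
    with ThreadPoolExecutor(max_workers=JOBS) as pool:
        return list(pool.map(count_down, [N] * JOBS))


thr_result, thr_time = timed(run_threaded)

print(f"sequential: {seq_time:.2f}s, threaded: {thr_time:.2f}s")
```

On a standard CPython build with the GIL, the two timings are typically close, because only one thread runs Python bytecode at a time. For CPU-bound work, the usual workarounds are the multiprocessing module or native extensions that release the GIL.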

  2. Why are data scientists looking for Python alternatives?
  ===========================================================

Python is really cool. But it can't be both The Flash and The Incredible Hulk at the same time! Some people need The Flash, others need the Hulk; user needs are diverse. Likewise, Python was not designed specifically for math and data science applications. It relies on third-party libraries such as NumPy for numerical computing or TensorFlow for deep learning. A language designed for data work therefore has advantages over Python.

For example, in the business world, machine learning needs speed and scalability to keep execution time down. You can't keep customers waiting too long, so you need faster ML languages (general-purpose functional programming languages). The faster the better.

Python is fast to develop in, but what runs faster than Python? Julia, or Scala!

In this case, data scientists prefer to use a different programming language than Python.

  3. Python Alternatives for Data Science
  =======================================

Several Python alternatives perform on par with or better than Python in various application environments. Here are some alternatives to Python for data science:

(1)Julia

Julia is a new high-level dynamic programming language that is innovative, fast, and comfortable. It is a general-purpose language capable of writing various applications. It is worth mentioning that a large part of its software package ecosystem and functions are geared towards advanced numerical computing, which is very suitable for machine learning.

Julia draws on a variety of languages, including C and high-level dynamic languages such as Python, R, and MATLAB. It borrows optional typing, syntax, and features from these languages, because its goal is to combine their strengths and eliminate their weaknesses.

The creators of Julia wanted a language that could handle scientific computing, machine learning, data mining, large-scale linear algebra, and distributed and parallel computing as fast as C. The language they eventually developed comes close to that goal, and although Python is getting faster, Julia still beats it.

Julia simplifies much of the mathematical notation used in machine learning. It provides diverse GPU programming packages such as ArrayFire, which lets the GPU execute generic code. Each package has its own programming model: for example, CUDA.jl targets NVIDIA GPUs, AMDGPU.jl targets AMD GPUs, and oneAPI.jl targets Intel GPUs.

Machine learning engineers can quickly deploy Julia on large clusters and benefit from its powerful tools, such as MLBase.jl, Flux.jl (deep learning), MLJ.jl (general machine learning), and Knet.jl (deep learning). Flux is a high-speed deep learning library that comes with additional tools to make the most of Julia's capabilities. ScikitLearn.jl, TensorFlow.jl, and MXNet.jl are also available for ML applications.

Julia is good at solving complex computational problems. As a result, many prestigious universities, including Stanford University and Tokyo Metropolitan University, offer Julia courses. Its performance is also solid in comparison to Python.

Numerous industry figures such as Logan Kilpatrick believe that Julia is the future of machine learning and data science.

Main features of Julia

Some of Julia's core features enable efficient data science computing:

  • Julia's operations are fast. It is 2-20 times faster than Python;
  • Flexible and rich library functions;
  • Built-in package manager;
  • Calls Python through the PyCall package and C through the built-in ccall;
  • Manage other processes using shell-like functionality;
  • Developed for parallel and distributed computing;
  • Automatically generate efficient code for various argument types;
  • Free and open source software with MIT license.

(2)Scala

Scala is a high-level programming language that supports both object-oriented and functional programming. Martin Odersky created it, and it was officially launched in June 2004. Scala is gaining popularity among developers and is leapfrogging today's technology.

Scala is a Java Virtual Machine (JVM) language compatible with Java applications and libraries. It is statically typed, and every value is an object; the language has no separate notion of primitive data at the source level. Its multi-paradigm design, aimed at multi-core systems, makes it relatively complex.

Apache Spark is a powerful, fast tool for real-time data streaming and data processing. Using Spark from Scala makes complex mapping, ETL, and large data processing tasks easier. Because Spark itself is written in Scala, knowing Scala lets users read Spark's code and implement new Spark features. Scala's compatibility with Java also helps programmers quickly grasp its object-oriented concepts.

Spark MLlib is a scalable machine learning library that combines high-quality algorithms with Spark's excellent performance. It covers classification, regression, clustering, collaborative filtering, and dimensionality reduction. Breeze, Spire, Saddle, and ScalaLab are other libraries that help create powerful data processing applications.

Apache Kafka, an open source distributed event streaming platform, is another tool in this ecosystem. Spark has also become popular among Python users through PySpark, but jobs that execute slowly in Scala Spark are likely to perform even worse, or fail, in PySpark. Additionally, you can use the TensorFlow Scala library to create an adaptable, high-performance serving system for machine learning models.

Scala's static typing is helpful for complex applications. Its JVM and JavaScript runtimes give high-performance systems easy access to large library ecosystems. It also outperforms Python.

Main features of Scala

Some of Scala's core features enable efficient data science computing:

  • Scala is 10 times faster than Python;
  • Advanced type inference;
  • Case classes and pattern matching;
  • Concise yet readable, expressive syntax;
  • Higher-order functions;
  • Singleton objects instead of static variables;
  • OOP, FP, or mixed-style code;
  • Source code compiles to ".class" files that run on the JVM.

(3)MQL5

MQL5 is a high-level object-oriented programming language that provides advanced data analysis and machine learning capabilities. It is based on the widely used and well-known programming language C++ and is known for its speed and versatility.

MQL5 is not an everyday ML language like Python, Julia or Scala. It was created specifically for financial markets to monitor financial instruments. The core of the language is similar to other languages, but with unique features. MQL5 supports integers, booleans, literals, strings, dates, times and enums. It defines both structures and classes as complex data types.

The MQL5 language documentation lists functions, operations, reserved words, and so on, along with the data types described above. It also includes standard library class definitions for trading strategies, control panels, custom visual effects, file access, and more.

In addition, MQL5 has more than 1,500 source code repositories to draw on for new application development. You can use the ALGLIB library, which contains a large number of numerical analysis functions. There are also the TimeSeries library for working with time series, the Fuzzy library for developing fuzzy models, and various other libraries.

MQL5 is a powerful programming language that helps you build real-time systems and provides visual decision aids. It supports enumerations, structures, classes, and events. It also offers a large number of built-in functions, and MQL5 programs can communicate with DLLs.

MQL5's syntax is similar to C++, which makes it easy to port programs written in other programming languages to MQL5. As a result, you can get efficiency comparable to C++ when building data analysis, artificial intelligence, or financial tools such as trading bots.

Main features of MQL5

Some core features of MQL5 enable efficient data science computing:

  • MQL5 is based on C++. Therefore, its speed is comparable to C++ and surpasses Python;
  • Created for tools and analysis of financial markets;
  • Fully event driven;
  • 1500+ source code repositories;
  • Predefined standard constants and enumerations, plus service structures for storing information;
  • Change color schemes, create dashboards, add custom symbols and export price charts from MQL5 programs;
  • 12 new drawing styles, 512 buffers, and direct index value calculation from past to future;
  • Debug Forex Expert Advisors in Charts and Multi-Currency Tester.

  4. Which Python ML alternative should you choose?
  =================================================

Choosing a programming language is like choosing a racing car. The right car is essential, and so is the driver. As a data scientist, you are the driver of these unique sports cars.

Every data scientist or developer has their own requirements and specifications for a given technology. The best choice depends on your perspective and the functionality you need, so it is difficult to name a single winner.

If you want to implement parallel computing and need very high computing speed, choose Julia. It is the fastest of the three. On the other hand, if you are working on large projects, Scala should be the better choice because its static typing helps with complex applications. If you are interested in financial instruments, MQL5 is the best choice.

Likewise, each programming language has its own characteristics and areas of expertise. Therefore, it is best practice to choose a language that meets your requirements and specifications.

  5. Summary
  ==========

Machine learning is a continuously evolving field. Over time, Python became its most popular language. Today, machine learning languages are developing in diverse directions to meet different needs.

In short, Python is a very good machine learning language. But Python is not a language built specifically for data science, and some more modern programming languages can perform specific tasks more efficiently.

Hence, emerging programming languages are gaining market share and growing in the field of machine learning. Hopefully, they will also become popular in the next few years.

Original link: https://hackernoon.com/not-only-python-problems-errors-and-alternatives

Translator introduction

Liu Tao, a community editor, is in charge of online system inspection and control at a large central enterprise. His main responsibilities are the rigorous review of system go-live acceptance checks, penetration testing, baseline inspections, and other inspections. He has many years of experience in network security management, PHP and web development and defense, and Linux use and administration, along with rich experience in code auditing, network security testing, and threat mining. He is proficient in SQL auditing under Kali, SQLMAP automatic detection, XSS auditing, Metasploit auditing, CSRF auditing, webshell auditing, maltego auditing, and related technologies.
